Satellite Measurements #2: Arima

Here’s an interesting little graphic and analysis of the new satellite data. It’s hard not to scratch your head sometimes at this entire subject matter, when you see the effect of pretty simple alternatives. The satellite data is modelled very nicely as ARMA c(1,0,1), which would imply entirely different conclusions about this data set and completely different projections.

Figure 1. Global Satellite (downloaded August 9, 2005). Black- raw data; red- arima c(1,0,1) fit and projections; blue – "trend". Updated below.

If you simply look at a plot of the satellite data, it looks like data generated from an arima (or even an ARMA) process. Arima models are about 2nd year statistics and not much harder than trend lines and well within the reach of even climate scientists. In very rough terms, an ARMA model is a bit like your bankroll when you’re gambling in a "fair" game. The fit with an ARMA process (red) with only two parameters (the same number as fitting the "trend" line) is obviously pretty spectacular as compared with the trend. For those interested, the log-likelihood of the ARMA model is 250.7, as compared with the logLik for the trend model of only 95.0 (this more or less quantifies the better fit of the red than the blue).

The ARMA coefficients are AR1 of +0.9257 and MA1 of -0.3183. This is a very high AR1 coefficient; I’m going to be posting about AR1 coefficients in some upcoming posts. The projection using the ARMA process is shown: it is a "fair" game and reverts to 0, with a 2-sigma confidence interval of 0.39 deg C. The "trend" is 0.122 per decade. The difference between the two models illustrates a very important statistical point in a simple case: the residuals from the trend are not white noise, and so the "trend" model is mis-specified in statistical terms. I’ve not shown here the mis-specification in the "trend model" through the trend residuals – I’ll have to look up how that’s done. However, the counter-example with a superb fit with a c(1,0,1) ARMA error structure is really quite remarkable.
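
For readers who want to see where a band of roughly that width comes from, here is a minimal Python sketch (the thread's own script is in R). It assumes R's sign convention x[t] = phi*x[t-1] + e[t] + theta*e[t-1] and a zero-mean process, and it takes the innovation variance 0.01217 from the R output quoted in the comments. The starting value 0.3 and last shock 0.1 are made up purely to show the forecast decaying toward zero.

```python
import math

# Values quoted in the post and in the R output further down the thread.
PHI = 0.9257       # AR1 coefficient
THETA = -0.3183    # MA1 coefficient
SIGMA2 = 0.01217   # innovation variance ("sigma^2 estimated as 0.01217")


def psi_weights(phi, theta, n):
    """First n MA(infinity) weights of an ARMA(1,1) process."""
    weights = [1.0]
    for j in range(1, n):
        weights.append((phi + theta) * phi ** (j - 1))
    return weights


def forecast_sd(phi, theta, sigma2, horizon):
    """Standard deviation of the h-step-ahead forecast error."""
    psi = psi_weights(phi, theta, horizon)
    return math.sqrt(sigma2 * sum(w * w for w in psi))


def point_forecast(phi, theta, last_value, last_shock, horizon):
    """h-step-ahead forecast of a zero-mean ARMA(1,1) process."""
    one_step = phi * last_value + theta * last_shock
    return one_step * phi ** (horizon - 1)


# Stationary (long-horizon) standard deviation, closed form.
stationary_sd = math.sqrt(
    SIGMA2 * (1 + 2 * PHI * THETA + THETA ** 2) / (1 - PHI ** 2)
)

# Hypothetical last observation (0.3) and last shock (0.1), for illustration.
for h in (1, 12, 60, 120):
    print(h,
          round(point_forecast(PHI, THETA, 0.3, 0.1, h), 4),
          round(2 * forecast_sd(PHI, THETA, SIGMA2, h), 3))
print("limiting 2-sigma band:", round(2 * stationary_sd, 3))
```

Under these assumptions the limiting 2-sigma half-width comes out a little above 0.4 deg C, which is in the same neighbourhood as the 0.39 quoted above, and the point forecast shrinks geometrically at rate 0.9257 per month.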

The obvious question is: does this data set show any statistically significant trend at all? It’s not the issue that people are discussing. The appropriate comparison is to ARMA models, as shown here. I’m not familiar with satellite literature and perhaps this ARMA modeling is old hat to them, but it’s sure not at the tip of their tongues (both sides) or we would be seeing quite different discussions about "trends". I’m not saying that there isn’t a trend here – only that the statistical methods being used appear to be so naive that no light is being shed on the matter – actually it’s worse than no light. To examine the matter from first principles really needs to be done.
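
One way to get a feel for the question is a small Monte Carlo experiment. The Python sketch below is only an illustration under stated assumptions: it generates series that have no trend at all from an ARMA(1,1) process with the coefficients above (innovation variance 0.01217 and length 321 taken from the R output in the comments), fits an ordinary least-squares line to each, and counts how often the usual white-noise t-statistic exceeds 2 in absolute value. The seed, burn-in length and number of simulations are arbitrary choices.

```python
import math
import random

PHI, THETA, SIGMA = 0.9257, -0.3183, math.sqrt(0.01217)
N_MONTHS = 321  # series length reported in the thread's R script


def simulate_arma11(n, phi, theta, sigma, rng, burn_in=200):
    """Simulate a zero-mean, trendless ARMA(1,1) series of length n."""
    x_prev, e_prev = 0.0, 0.0
    out = []
    for t in range(n + burn_in):
        e = rng.gauss(0.0, sigma)
        x = phi * x_prev + e + theta * e_prev
        if t >= burn_in:
            out.append(x)
        x_prev, e_prev = x, e
    return out


def naive_trend_t(y):
    """OLS slope t-statistic that (wrongly) assumes white-noise residuals."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    sse = sum((v - intercept - slope * t) ** 2 for t, v in enumerate(y))
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope / se


rng = random.Random(2005)  # arbitrary seed, for repeatability only
n_sims = 200
hits = sum(
    abs(naive_trend_t(simulate_arma11(N_MONTHS, PHI, THETA, SIGMA, rng))) > 2.0
    for _ in range(n_sims)
)
false_positive_rate = hits / n_sims
print("share of trendless series with |t| > 2:", false_positive_rate)
```

If the white-noise assumption were harmless, that share would sit near 5%. With errors this persistent it comes out far higher, which is the sense in which the naive "trend" significance test is mis-specified.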

UPDATE (Evening) : I did a new ARMA run adding a time-term for regression. The regression coefficient (with ARMA errors) was 0.0000223329 deg C/decade and the logLikelihood was essentially unchanged at 251.05 (versus 250.71). Obviously, in statistical terms, you cannot reject the null hypothesis of no trend. I re-emphasize that the usual "trend" calculation looks to be mis-specified. This is a very routine calculation that I’ve done and you would think that it would have been discussed somewhere in the huffing and puffing about satellite versus surface. We shall see. In the meantime, I don’t see anything wrong with the calculation that I’ve done; it "looks" logical as well.
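
The "cannot reject" statement can be checked with a likelihood-ratio calculation. The short Python sketch below uses only the two log-likelihoods quoted in this update and assumes the standard large-sample result that twice the log-likelihood gain from one extra parameter is approximately chi-square with 1 degree of freedom.

```python
import math

loglik_arma = 250.71        # ARMA(1,0,1), from the post
loglik_arma_trend = 251.05  # ARMA(1,0,1) plus a time term, from the update

# Likelihood-ratio statistic for adding one parameter (the time term).
lr_stat = 2.0 * (loglik_arma_trend - loglik_arma)

# For 1 degree of freedom the chi-square tail probability has a closed form:
# P(chi2_1 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

print("LR statistic:", round(lr_stat, 2))
print("approx. p-value:", round(p_value, 2))
```

The statistic is well below the 5% critical value of 3.84 for one degree of freedom, so on these numbers the time term adds nothing statistically detectable.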

Plot Script


  1. John A
    Posted Aug 10, 2005 at 8:42 AM | Permalink

    This is what I’ve wondered: Is the linear trend statistically meaningful?

  2. John Hekman
    Posted Aug 10, 2005 at 9:36 AM | Permalink

    This is great. The way that I have seen ARMA models used is first to come up with a theoretical model of the problem and then use a regression package to estimate it along with one or more AR terms as determined by the package. So what is suggested here is to run one model with sunspot time series data and see if the sunspots are significant when AR terms are used to measure the auto-regressive components and check for stationarity. Then a second model could be run with a time series of CO2 in the atmosphere. Which model would work better? That’s at least more than has been done so far.

  3. Greg F
    Posted Aug 10, 2005 at 9:47 AM | Permalink

    This is what I’ve wondered: Is the linear trend statistically meaningful?

    No, because climate is an oscillating system. The simplest oscillator is a sinusoid, which obviously has no trend. One cycle of a sinusoid is 2*Pi radians long. In a single cycle the only points that will result in a linear trend of 0 are at Pi and 2Pi radians. In more general terms the only time a trend will produce a slope of 0 for a sinusoid is when the last data point is an integer multiple of Pi radians from the first data point.
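
The general point, that a straight line fitted to a pure oscillation usually has a non-zero slope, is easy to try numerically. This Python sketch is an illustration only: it fits a least-squares line to a sampled sine or cosine over a chosen window, and the three windows are picked arbitrarily to show that the fitted slope depends on both the window length and the starting phase.

```python
import math


def ols_slope(t, y):
    """Least-squares slope of y regressed on t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
    sxx = sum((a - t_mean) ** 2 for a in t)
    return sxy / sxx


def sinusoid_trend(start, stop, n=1001, func=math.sin):
    """Slope of a straight line fitted to func sampled over [start, stop]."""
    step = (stop - start) / (n - 1)
    t = [start + i * step for i in range(n)]
    return ols_slope(t, [func(v) for v in t])


# Rising limb of a sine only.
quarter_cycle = sinusoid_trend(0.0, math.pi / 2)
# One full cycle of a sine starting at phase 0.
full_cycle = sinusoid_trend(0.0, 2 * math.pi)
# One full cycle of a cosine, window centred on the peak.
symmetric = sinusoid_trend(-math.pi, math.pi, func=math.cos)

print(quarter_cycle, full_cycle, symmetric)
```

Only the window that is symmetric about a peak gives a slope of essentially zero; the other two give clearly positive and clearly negative "trends" from a signal that has none.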

  4. John Cross
    Posted Aug 10, 2005 at 9:54 AM | Permalink

    Steve: While I am not an expert I suspect that the satellite data has been looked at in detail. For example consider the following:

    The apparent discrepancy in global warming rates as
    measured by the surface network and the satellite MSU
    system has been oft-noted and subject to vigorous scientific
    discussion (Michaels et al. 2000, National Research
    Council 2000, Santer et al. 2001).

    written by a Dr. Ross McKitrick.

  5. Steve McIntyre
    Posted Aug 10, 2005 at 10:23 AM | Permalink

    John C, I would distinguish between being “subject to vigorous scientific discussion” and being “looked at in detail”. For example, Jones won’t disclose the surface station data so this analysis is limited. Again, I don’t pretend to know this area in detail. So if you’re aware of any ARMA models applied to the satellite data, I’d appreciate a reference. You’ll have to agree that the ARMA model yields a much better fit than a “trend” model and given the very good ARMA fit, it’s not obvious that there is a trend, rather than an ARMA process.

  6. Posted Aug 10, 2005 at 10:31 AM | Permalink

    In my view both models are incorrect.

    Simple Linear Trend:

    We can see that the data has clear oscillations. Hence graphing the errors from the linear model will show that the model is clearly mis-specified. Adding in terms that can explain the oscillations is the best way to go.


    The predictions based on the ARIMA/ARMA model look completely unrealistic. We have about 25 years of historical data showing oscillations, yet the predictions from the model show none of this. Clearly this model lacks predictive power even more so than the linear trend, IMO, and hence is actually worse.

    The best thing is to augment the linear trend with additional explanatory variables. Ideally, it would be best to replace the trend variable with another variable that explains the trend. One obvious candidate should be the amount of CO2 in the atmosphere. The theory is that increased GHG, with CO2 being the primary bad driver, is resulting in the warming trend. AFAIK no modelling like this has been done. If anybody has any links to something like this I sure would appreciate it.


    Can you post a link to the data?

  7. Posted Aug 10, 2005 at 10:37 AM | Permalink

    Also, we could try augmenting the ARMA model with explanatory variables that account for the oscillations, i.e. an ARMAX model. Of course, once we do this the autoregressive and moving average components could take a hit in terms of magnitude and statistical significance.

    I’d also like to point out that while statistical significance is important it is, IMO, ultimately irrelevant. What we want isn’t a model that is statistically significant (I can get that by simply exhausting the degrees of freedom R2 = 1 Yay!). What we want is a model with good predictive power. While a model that has good predictive power would have a good fit (overall, the issue of multi-collinearity could still render the individual variables insignificant in statistical terms), not every model that has good statistical fit is going to have good predictive power.

    This was basically one of Steve’s main points in regards to Preisendorfer’s rule N post. Statistical significance while necessary is not sufficient.
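
The "exhaust the degrees of freedom" remark can be made concrete with a toy example. In the Python sketch below the data are made up (a straight line y = x with an alternating wiggle of 0.5); a ten-parameter polynomial is forced through all ten points and compared with a two-parameter least-squares line one step beyond the sample. It is an illustration of fit versus predictive power, not a model of the temperature data.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial that passes exactly through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    slope = (sum((a - x_mean) * (b - y_mean) for a, b in zip(xs, ys))
             / sum((a - x_mean) ** 2 for a in xs))
    return y_mean - slope * x_mean, slope


# Made-up data: a straight line y = x with an alternating +/-0.5 wiggle.
xs = list(range(10))
ys = [x + 0.5 * (-1) ** x for x in xs]

# Ten parameters for ten points: every observation is reproduced exactly.
in_sample_error = max(abs(lagrange_eval(xs, ys, x) - y) for x, y in zip(xs, ys))

# One step outside the sample the two fits disagree wildly.
intercept, slope = fit_line(xs, ys)
poly_next = lagrange_eval(xs, ys, 10)
line_next = intercept + slope * 10

print(in_sample_error, poly_next, line_next)
```

The interpolating polynomial has a perfect in-sample fit (R2 = 1) and a next-step prediction that is off by hundreds, while the humble line stays close to 10.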

  8. Posted Aug 10, 2005 at 10:44 AM | Permalink

    Douglass, David H, B. David Clader, and R.S. Knox, 2004, Climate sensitivity of Earth to solar irradiance: update. Physics, abstract physics/0411002.

    figure 2 (page 16 of the pdf) shows the results of a multivariate analysis of volcanics, enso and solar.


    This paper is a continuation of a study by Douglass and Clader. We extend the analysis through December 2003 using the latest updates of the observational temperature and solar irradiance data sets in addition to a new volcano proxy data set. We have re-determined the solar effect on the temperature from satellite measurements of the solar irradiance and the temperature of the lower troposphere the sensitivity to solar irradiance. This re-analysis calculates two newly recognized dynamic and non-radiative flux factors which must be applied to the observed sensitivity. The sensitivity is about twice that expected from a no-feedback Stefan-Boltzmann radiation balance model, which implies positive feedback.

    The sensitivity to volcano forcing is also determined. Preliminary results indicate that negative feedback is present in this case. Response times of fractions of a year are found for both solar and volcano forcing.

    We note that climate models generally assume relaxation times of 5 to 10 years and we comment on the consequences of this large disparity. We also have determined a linear trend in the data.

  9. John Hekman
    Posted Aug 10, 2005 at 11:25 AM | Permalink

    Steve Verdon: regarding your comment that the ARMA process does not yield a good predictor–that is the point. A completely auto-regressive process, which the earth’s temp may in fact be, would not be predictable.
    Steve McIntyre: I think that the high significance you refer to in your post is only half-correct. Many variables have a high r-squared when modeled as ARMA because they are relatively stable. For example, GDP has a very highly significant ARMA process. But at the same time there are economic variables such as investment and labor force that can be significant in explaining some of the movement of GDP. What you need to do is to have AR and MA components in your estimation along with another variable such as CO2 or solar activity.

  10. Steve McIntyre
    Posted Aug 10, 2005 at 12:10 PM | Permalink

    John Hekman, I don’t disagree. This was just a quick little exercise to see if anything was going on. It looks like it might be worth spending time on, though I need another topic like a hole in the head. Steve V. Here’s a URL together with a script in R to collect the data, and do the arima analysis. I’ve added a Durbin-Watson statistic, which is 0.445 (hardly surprising given the success of the ARMA (1,0,1) model).

    url <- ""  # data URL missing from this copy of the comment
    N<-length(NH);N #321

    # ar1 ma1
    # 0.9257 -0.3183
    #s.e. 0.0241 0.0571
    #sigma^2 estimated as 0.01217: log likelihood = 250.71, aic = -495.43


    #(Intercept) -24.29584 2.60982 -9.309 <2e-16 ***
    #year 0.01222 0.00131 9.325 <2e-16 ***
    #Signif. codes: 0 `***’ 0.001 `**’ 0.01 `*’ 0.05 `.’ 0.1 ` ‘ 1
    #Residual standard error: 0.1804 on 318 degrees of freedom
    #Multiple R-Squared: 0.2147, Adjusted R-squared: 0.2123
    #F-statistic: 86.96 on 1 and 318 DF, p-value: < 2.2e-16

    logLik(fm) #`log Lik.' 94.98412 (df=3)
    plot.ts(GL,xlim=c(1977,2017),xlab="",ylab="Anomaly deg C")
    lines(seq(1978+11.5/12,2015+6.5/12,1/12),-24.29584+ 0.01222 *seq(1978+11.5/12,2015+6.5/12,1/12),col="blue")

    #DW = 0.4445, p-value < 2.2e-16
    #alternative hypothesis: true autocorrelation is greater than 0
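
For anyone unfamiliar with the Durbin-Watson statistic reported at the end of the script above, here is a small Python sketch of how it is computed and how to read it. The toy residuals are made up; the only number taken from the thread is DW = 0.4445. The conversion to a lag-1 autocorrelation uses the usual large-sample approximation DW ~ 2*(1 - r1).

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by their sum of squares."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(r * r for r in residuals)
    return num / den


def implied_lag1_autocorr(dw):
    """Large-sample approximation DW ~ 2 * (1 - r1), solved for r1."""
    return 1.0 - dw / 2.0


# Toy residuals (made up) at the two extremes.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # sign flips every step
dw_persistent = durbin_watson([1.0, 1.1, 0.9, 1.0])     # barely moves
print(dw_alternating, dw_persistent)

# The DW value reported for the trend residuals in the script above.
r1_from_thread = implied_lag1_autocorr(0.4445)
print(r1_from_thread)
```

Values near 2 indicate uncorrelated residuals; a value as low as 0.44 corresponds to a lag-1 residual autocorrelation of roughly 0.78, which is why the white-noise assumption behind the trend line's standard errors does not hold.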

  11. Posted Aug 10, 2005 at 1:05 PM | Permalink

    >To examine the matter from first principles really needs to be done.

    Readers may be interested in the work of econometricians Stern and Kaufmann
    in climate sciences on what I think are similar models, such as:
    Evidence for human influence on climate from hemispheric temperature relations
    RK Kaufmann, DI Stern – Nature, 1997 –
    (a search for the authors’ names will yield many more).

    I have wondered why this obviously powerful and useful approach was not
    more widely used. Cheers

  12. Posted Aug 10, 2005 at 1:32 PM | Permalink

    Steve Verdon: regarding your comment that the ARMA process does not yield a good predictor–that is the point. A completely auto-regressive process, which the earth’s temp may in fact be, would not be predictable.

    I’m not sure this is the case. For example, I downloaded the CO2 data by Keeling and Whorf and also downloaded some data on the SOI. In looking at the data there appears to be a lag of about 5 months between temperature and the SOI. Both were statistically significant, but the R2 (0.264) was still rather low. Adding other relevant variables (e.g. the volcano eruptions noted by Hans) could help improve the fit as well.
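
A lag like this is typically found by correlating one series against shifted copies of the other. The Python sketch below shows the idea on synthetic stand-in data only (it does not use the SOI, CO2 or temperature series): a random "driver" and a "response" that follows it five steps later plus noise. The names, the noise levels and the true lag of 5 are all assumptions chosen for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlations(driver, response, max_lag):
    """Correlation between driver[t - lag] and response[t] for each lag."""
    out = {}
    for lag in range(max_lag + 1):
        out[lag] = pearson(driver[: len(driver) - lag], response[lag:])
    return out


# Synthetic stand-ins (not real SOI or temperature data): the response
# follows the driver five steps later, plus noise.
rng = random.Random(42)
TRUE_LAG = 5
driver = [rng.gauss(0.0, 1.0) for _ in range(300)]
response = [
    (0.8 * driver[t - TRUE_LAG] if t >= TRUE_LAG else 0.0) + rng.gauss(0.0, 0.5)
    for t in range(300)
]

corrs = lagged_correlations(driver, response, max_lag=12)
best_lag = max(corrs, key=lambda k: abs(corrs[k]))
# Lag with the strongest correlation, and the R2 of a one-variable fit at that lag.
print(best_lag, round(corrs[best_lag] ** 2, 3))
```

With real monthly climate series both variables are strongly autocorrelated, so the correlation peak is broader and its significance needs the same care about residual structure discussed in the post.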

  13. TCO
    Posted Aug 10, 2005 at 5:51 PM | Permalink

    1. What is an ARMA and why do you assume that we know what it is and is it relevant or just looks cute.

    2. The stock market is pretty choppy, but the trend over enough time is pretty close to undeniably up…

  14. Michael Jankowski
    Posted Aug 10, 2005 at 6:15 PM | Permalink

    I think the linear “blue” trend is just a means of comparing the overall trend of the lower troposphere readings vs the surface temperature readings since 1979 since the lower troposphere is supposed to be warming at least as fast as the surface readings. It’s not supposed to be predictive, constant, etc…just a means of comparison.

  15. TCO
    Posted Aug 10, 2005 at 6:44 PM | Permalink

    Should the alleged heat island effect also affect the lower atmosphere? Is the alleged difference a result of different surface closeness or is it because of placement of thermometers? But wouldn’t that be overcome by area weighting anyway?

  16. Steve McIntyre
    Posted Aug 10, 2005 at 7:19 PM | Permalink

    ARMA means autoregressive-moving average. It’s a technique used in statistical analysis popularized by Box and Jenkins. It’s pretty fundamental. I’ll try to explain stuff from time to time, but often I’m just writing up quick notes and I’ll have to leave some of the stuff pretty terse. If someone’s interested, you can usually google a term that you don’t understand – or ask and I’ll try to answer but no promises. I’m not going to explain ARMA processes. It’s beyond the scope of a blog.

  17. Steve McIntyre
    Posted Aug 10, 2005 at 7:30 PM | Permalink

    Here’s the script for the regression calculation within ARMA:


  18. Peter Hearnden
    Posted Aug 11, 2005 at 12:47 AM | Permalink

    Re #16 ‘I’m not going to explain ARMA processes. It’s beyond the scope of a blog.’. Well, isn’t that just great…You post all this stuff Steve, stuff that is meaningless to 99% of us, and then that’s it? What’s the point? That we take it on trust that you’re right? Yeah, just like you do with MBH I suppose. That you’re obfuscating? That a decade or so of the satellite temperature recording being practically the only thing you sceptics trumpet is, bingo!, reversed and the sats are meaningless? John Daly must be turning in his grave :(.

  19. David Brewer
    Posted Aug 11, 2005 at 1:35 AM | Permalink

    By George you are on to something!

    Why on earth during all this GW kerfuffle did someone not think of trying a best-fit formula for the satellite temperature curve?

    We all know the independent variables worth trying: GHG load (all gases, Hansen/Sato is a useful series), El Nino (ENSO will do), volcanos and solar. The latter two might be able to be combined in a solar irradiance number at the surface, leaving only three factors.

    We have less than 30 years of satellite data, so the result is going to be subject to a fair bit of revision. But you can see from that you are likely to get very close with just these 3/4 factors and appropriate delays (~6 months) on the non-GHG elements. Another angle might be to use balloon data back to 1958, as I believe Hans Erren may have tried. Surface data are more problematical, and quite useless before about 1920.

    Sorry I don’t have the maths for this, but anyone who does can make their own estimate of the climate sensitivity of CO2, and be practically the first in the field with a serious number based on data. This would be far more useful than the various curve-fitting exercises on both sides of this argument, which in any case never seem to get around to disclosing what climate sensitivities they ended up with.

    Dave Brewer

  20. Larry Huldén
    Posted Aug 11, 2005 at 2:41 AM | Permalink

    RE #18

    Peter Hearnden: “I’m not going to explain ARMA processes. It’s beyond the scope of a blog.’. Well, isn’t that just great…You post all this stuff Steve, stuff that is meaningless to 99% of us”

    Steve does not need to explain everything if the basics can be found from other sources. Steve has explained all he can about the hockey stick because the hockey team hides critical basics, data and methods.
    No common sense shows that CO2 is the basic underlying cause of the recent temperature trends. When you reverse the current models you will not find the ice age. So we still have to ask questions.
    As soon as methods are questioned, this stuff is not meaningless to 99% of us.

  21. Steve McIntyre
    Posted Aug 11, 2005 at 5:45 AM | Permalink

    Peter H, I don’t expect anybody to take anything on trust. I am also not making any strong assertions about satellite trends. I haven’t submitted a paper for publication. I haven’t researched the literature. I specifically asked for references from anyone who had looked at this topic.

    However, the calculations that I did were very curious and I sometimes use the blog to “think out loud”.

    I posted up the computer script for the calculation that I did. I’ve done this for articles, but I noticed that Hans posted up his script for a short calculation and I did so as well. (This is perhaps a form of mutual positive feedback if you will.) If you are interested, then you can easily replicate the calculation. There are many textbooks on ARMA processes – Box and Jenkins is a classic source that I use. As Larry pointed out, this is public domain and readily accessible. But in a rough sense, think in terms of random walks or cumulative gambling outcomes in a repetitive casino (with some moving averaging).

  22. Tom Rees
    Posted Aug 11, 2005 at 6:28 AM | Permalink

    Steve – FWIW shoving a linear trend through these data has no statistical meaning. It’s not a statistical model. It’s just an easy form of data reduction. For detection and attribution studies, what’s done is a process called optimal fingerprinting. This does have statistical meaning, but it’s not something you can do on Excel 🙂 Until we have the optimal fingerprint results, we have to talk about something…

  23. Steve McIntyre
    Posted Aug 11, 2005 at 6:41 AM | Permalink

    Tom, IPCC has a very strange section entitled “Optimal Detection is Regression”, which then includes some pretentious linear algebra which seems to show, as the headline says, that “optimal fingerprinting” isn’t anything special. The point that I’m making here is not particular to this edition of the satellite data or to any present dispute: it applies to every version of the data, so there’s been ample opportunity to study the issue. As I said, it would surprise me if someone hasn’t looked at the data in this way, since it is such an obvious and simple analysis, but if they haven’t, they haven’t.

    I don’t use Excel other than for occasionally editing data. I would encourage you to download R and start using it. Among other skills, it has a truly wonderful ability to download data from ftp sites automatically or with minimal formatting for skip-lines. It can also handle big data sets relatively effortlessly.

  24. Tom Rees
    Posted Aug 11, 2005 at 7:28 AM | Permalink

    Steve – I’ll have a go with R – thanks. I know there’s a large body of publications about climate stats – von Storch authored one last year which you may know about but others might be interested in because it’s relatively accessible. I don’t pretend to have any expertise tho.

  25. Greg F
    Posted Aug 11, 2005 at 7:39 AM | Permalink

    If you know how to write VBA code Excel becomes infinitely more useful. Large data sets are easily accessible by storing and retrieving them from a database. I have not written anything to access FTP sites but have used Excel to log in and extract data from html pages. Relatively speaking, FTP should be a walk in the park. Can you animate a graph in R? I guess it all depends on what you want to do.

  26. Steve McIntyre
    Posted Aug 11, 2005 at 7:41 AM | Permalink

    Thanks, Tom. I’ve seen the vSZ article. The issues that I have in mind are different.  On your point about "interim" modeling of a stochastic process with a trend, the article by Peter Phillips here, New Tools for Understanding Spurious Regressions touches on the issue from a quite different point of view. You’ll probably find it heavy going, but you’ll see that sometimes things that seem obvious are not so.

  27. Michael Jankowski
    Posted Aug 11, 2005 at 8:03 AM | Permalink

    Well, isn’t that just great…You post all this stuff Steve, stuff that is meaningless to 99% of us, and then that’s it? What’s the point?

    Did you understand “standard centering methods,” principal components, Pres-N, etc, and all of the other methodology that was waved-over you in MBH98? I doubt it. Yet you certainly accepted it (and still seem to, for the most part).

  28. Michael Jankowski
    Posted Aug 11, 2005 at 8:22 AM | Permalink

    The lower troposphere readings of the satellites are supposed to be free of urban heat island effects. And with essentially worldwide spatial coverage, they aren’t as pre-disposed as the surface measurements are to be potentially affected by a disproportionate number of urban readings in the first place. But the AGW crowd insists the surface readings account for urban heating and/or claim it to be negligible, so that wouldn’t explain the difference between the warming rates.

    Current greenhouse theory and GCMs claim that the lower troposphere should warm at least as fast as the air at the surface due to GHGs. It’s not the difference in temps between the sats and surfs that is important (we’d expect them to often be different because they are measuring different altitudes), it’s the difference in the rate of warming between the sats vs surfs that is relevant. Different rates of warming between the lower troposphere and the surface indicate a problem with either one of the measurement records or with greenhouse theory itself.

  29. Greg F
    Posted Aug 11, 2005 at 8:29 AM | Permalink

    Current greenhouse theory and GCMs claim that the lower troposphere should warm at least as fast as the air at the surface due to GHGs.

    Correct me if I am wrong but I think you’re being far too generous here. It is my understanding that the troposphere should be warming faster. Not even getting the sign right is a pretty good argument for reworking the greenhouse theory.

  30. Steve McIntyre
    Posted Aug 11, 2005 at 8:45 AM | Permalink

    The formulations of AGW by Ramanathan in the 1970s which are the source of the 4 wm-2 figure give this figure for the top-of-atmosphere forcing under 2xCO2 and state that the forcing at surface in the tropics would be less than 1 wm-2, so that any surface heating above this would be by convection/conduction processes. In a “fingerprinting” scheme, one would specifically look for tropical tropospheric warming relative to surface warming as “fingerprint” evidence of AGW as opposed to solar forcing which would work surface out. I once started trying to trace the 4 wm-2 from Ramanathan and compare to the modern “fingerprinting”. It would be a very interesting exercise to finish.

  31. Tom Rees
    Posted Aug 11, 2005 at 9:03 AM | Permalink

    Steve, it seems to me that there are two issues here. The first is, is it getting warmer/colder whatever. The second is, what is the cause. A linear trend gives you an indication of the answer to the first Q – so does just eyeballing the graph (and both approaches probably have as much validity, but at least sticking a line through it gives you a number). The answer to the second may well be autocorrelation, random walks etc. But to get an answer to it, you have to produce a model (statistical or otherwise). The linear regression is not the model – interim or otherwise.

  32. TCO
    Posted Aug 11, 2005 at 9:12 AM | Permalink

    While I’m not a statistician or a person who uses the tools extensively like an econometrician or paleoclimatology meta-analyzer, I have had a semester course in Probability and Statistics and another semester in DOE. I was able to understand pretty quickly the idea of PC (that it is extracting vectors to describe the parameter space, like an electronic nose or such). But I had never heard of ARMA. This is a very slight, stylistic point, but I think references of such obscurity (same with Preisendorfer) require at least a clause reference or a footnote, if you are not going to spend a couple sentences describing at a concept level (as was done nicely for PC). So I basically agree with Hearnden but more out of “discussion convenience” purposes than some evil keep things hidden feeling.

    Oh…and I DO LIKE it when you insert a little bit of tutorial anyway. And you know if I do engage on the topic, I’m going to end up dragging it in by asking the clarifying questions like which method (ARMA or line) has more degrees of freedom. And for the “what is a global temperature”, I’m still not clear what that means either: land area determined, weighted by heat capacity, etc. etc. On that one, I’m not even asking what SHOULD we do. I’m asking what DO we do!? Amazing that we could have that whole pitiful thermo side discussion and not address this.

    Off topic: should I buy Box, Hunter, Hunter? They talked about it all the time in my DOE class. But our book was much cheaper, slimmer, easier.

  33. TCO
    Posted Aug 11, 2005 at 9:32 AM | Permalink

    Regarding heat island. If the troposphere is expected to mimic the surface, then shouldn’t it mimic heat islands? (unless there is some area weighting effect going on? Having heat islanded surface stations as proxies for larger areas that lack heat islands).

  34. TCO
    Posted Aug 11, 2005 at 9:35 AM | Permalink

    On the multiple regression, it seems like you ought to treat El Nino as a part of the system, with CO2, solar irradiance, and vulcanism as forcing functions. Also, you can’t leave out vulcanism entirely, but have to assume some average amount of it happening.

  35. John Hekman
    Posted Aug 11, 2005 at 11:07 AM | Permalink

    This has been a good discussion of the lack of statistical studies of sat temperature data and CO2 vs. solar activity. Where it really gets interesting is if you take into account that some researchers believe that solar activity warms the earth, and that causes an increase in CO2. This opposite causation theory has to be tested too. Is anyone here familiar with the work of Theodor Landscheidt, especially the papers archived at John Daly’s site? I like this one:

  36. John Hekman
    Posted Aug 11, 2005 at 11:09 AM | Permalink

    Here is the url:

  37. Michael Jankowski
    Posted Aug 11, 2005 at 11:34 AM | Permalink

    Re#29-I like to think of myself as generous, but I don’t know how true it is! I have seen it presented as “at least as much,” so I’m not sure it necessarily has to be greater in theory. But I do think in the GCMs, the LT warms significantly more than the surface.

    Re#33-As I understand it, the heat generated by the heat island would’ve dispersed (can’t think of a better term) significantly enough that it wouldn’t affect the lower troposphere signal at a localized location (ie, over an urban environment). Yes, it seems this heat would be incorporated into the global average of the satellite measurements, and it’s on a global scale that the LT warming is supposed to exceed surface warming. But as I said, the AGW crowd considers UHIs to be insignificant on a global scale, and I believe some of the surface records attempt to remove UHI effects.

    But…if UHIs are of any significance and if the surface measurements already account for UHI effects, but the satellite measurements are affected but don’t subtract UHI effects, then the difference between surface warming and LT warming is even more perplexing.

  38. Steve McIntyre
    Posted Aug 11, 2005 at 12:55 PM | Permalink

    There are 2 different surface temperature sets and both need to be analyzed. My impression of the surface data set (and it’s just an impression) is that the “rural” stations in the Jones CRU data set can be (say) Indonesian towns of 350,000, growing like weeds. Even small villages can have “UHI”. I think that there’s considerable hair on this dataset. But the sea surface temperature dataset also has to be considered: there you have to sort out bucket adjustments.

  39. Chas
    Posted Aug 11, 2005 at 1:21 PM | Permalink

    Re #23, 33, 37 for the urban or industrial signatures in various datasets
    Have a look at Maurellis:

    2003GL019024.pdf

  40. TCO
    Posted Aug 11, 2005 at 1:40 PM | Permalink

    please comment on 32

  41. JerryB
    Posted Aug 11, 2005 at 2:11 PM | Permalink

    In case anybody does not already have enough numbers to ponder, there is a file at:
    which breaks out various portions of the lower troposphere temperature variances from average as follows:

    Globe Land Ocean
    NH Land Ocean
    SH Land Ocean
    Trpcs Land Ocean
    NoExt Land Ocean (Ext = extratropics, beyond 30 degrees latitude)
    SoExt Land Ocean
    NoPol Land Ocean
    SoPol Land Ocean

    Caveat: I would not expect the estimated boundaries between land and ocean, or the perimeter of USA48, to be very precise, and I do not know their boundaries for the poles.

  42. TCO
    Posted Aug 11, 2005 at 2:13 PM | Permalink

    Maurellis seems to be saying that temp rise comes from planet paving-over, not from CO2.

  43. Steve McIntyre
    Posted Aug 11, 2005 at 2:27 PM | Permalink

    Re #32: I don’t know the ARMA texts well enough to comment. I have an edition of Box and Jenkins from the late 1960s when I was at university; it has the definitions but it has a lot that’s not really necessary. There’s probably something on the internet that would be enough – I’ll try to check some time. Another way to approach it (and one that I would recommend in conjunction with anything) is to download R and experiment with the arima function against some datasets that you understand to see what’s going on. Most people have modelled red noise as a rule of thumb with AR1 processes (ARMA (1,0,0)), but I’ve gotten quite intrigued with the different behavior of ARMA (1,0,1) processes in a climate context.

    In terms of degrees of freedom, the linear regression model shown in the graphic has 2 degrees of freedom (intercept and slope) and the ARMA model has two degrees of freedom (AR1 coefficient and MA1 coefficient). You don’t need a Ph.D. in statistics to see which 2-df model works better. I’m going to send this to Spencer and see what he thinks.
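
Because the two models carry the same number of fitted parameters, comparing them by an information criterion reduces to comparing log-likelihoods. The Python sketch below works through the AIC arithmetic using the log-likelihoods quoted in the R output earlier in the thread. Charging each model 3 parameters (its two coefficients plus the error variance) is an assumption, chosen because it matches the "df=3" that R reports for the trend fit.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Log-likelihoods reported in the thread's R output.
loglik_arma = 250.71      # ARMA(1,0,1)
loglik_trend = 94.98412   # straight-line trend

# Assumption: each model is charged 3 parameters, i.e. its two
# coefficients plus the error variance, as "df=3" suggests for the trend.
aic_arma = aic(loglik_arma, 3)
aic_trend = aic(loglik_trend, 3)

print(round(aic_arma, 2), round(aic_trend, 2), round(aic_trend - aic_arma, 2))
```

On this accounting the ARMA value lands within rounding of the aic = -495.43 shown in the R output, and the trend model is worse by about 311 AIC units, far more than the few units usually treated as a meaningful difference.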

  44. Michael Jankowski
    Posted Aug 11, 2005 at 2:40 PM | Permalink

    Even small villages can have “UHI”.

    As I’ve demonstrated to my wife a few times, our country house can be a very localized UHI. It’s particularly noticeable after a sunny day turns into a cool night. Walk away from the house quickly, and you can feel the air getting significantly cooler. Walk back towards the house, and it warms up. A poorly-placed weather station can have serious issues. Even a relatively well-placed weather station can have serious issues. Anything closer than maybe 50 feet to the house is noticeably warmer on many nights than at the fringes of the property. The addition of any homes around us, even though we’d stay remote and rural, would almost certainly have a measurable effect on thermometer readings (I assume they are more sensitive to recording temperature changes than I am!). Maybe even just the addition of a shed, a concrete retaining wall, etc, could increase the UHI on our property. Turn a gravel driveway on our or our neighbors’ property into paved asphalt or have the little 2-lane highway outside the house expanded to 4-lanes, and things would probably warm even more on our property. Put a few houses around us/village/town/city/metropolis, and we’d get from a few to several degrees warmer with no help from “enhanced greenhouse.”

    Now maybe some datasets attempt to adjust for urban warming, but they would likely depend on rural measurements for those adjustments. And as I’ve described above, even a very rural location can experience a localized “UHI.”

    I think that there’s considerable hair on this dataset.

    I would agree.

    As it is, the surface-based global average from GISS, CRU, NCDC, etc, often disagree (pretty significantly, IMHO).

    Davey and Pielke have a pretty nice piece here with some of the shortcomings of the US Historical Climate Network. There are tidbits out there raising serious questions about stations around the world, too. I’ve read a few in the past but don’t have any links handy.

  45. Steve McIntyre
    Posted Aug 11, 2005 at 2:44 PM | Permalink

    Re # 6 again:

    Steve V., Hans von Storch has an interesting note in J Clim 1999 12, 3505 about what to do when you want realistic variability in a forecast, but where your predictor has less variability than one reasonably expects in the field. Von Storch criticized a procedure employed by Karl [J Clim 1990] of inflating the variance in the predictors, an approach employed in MBH98 but not reported, and recommended that one simply add white or red noise if you need variance. One needs to distinguish between what a “typical” night at the casino would look like in terms of ups and downs and what the “expected” bankroll is at any given time.

  46. Douglas Hoyt
    Posted Aug 11, 2005 at 3:57 PM | Permalink

    Re: 44: “Even small villages can have “UHI”.”

    I agree. A friend of mine took a thermometer out into a flat desert region and could detect the effects of a paved road from 500 feet away. You don’t even need a single building, let alone a village, for an urban heat island to occur. I doubt that there are many places where thermometers have been placed that have not had a paved road built within 500 feet during the period they have monitored temperature.

  47. John Cross
    Posted Aug 11, 2005 at 4:10 PM | Permalink

    Steve: As you will see from my question I have minimal experience with ARIMA, but my impression is that it is not a useful metric for the dataset we are looking at. I would also be surprised to see if it wasn’t done at some point. Certainly ARMA is used in dendrochronology. However I only offer these comments as personal observations and could not justify either one (at the present time).

    Now to display my ignorance for all to see. Just for clarification, when you talk about an ARMA (1,0,1) are you talking about an ARIMA with the integration coefficient set to zero?


    Steve: I’m just doing ARMA so the middle differencing parameter is irrelevant (d=0), it’s AR1 and MA1. Like you, I would be surprised/dumbfounded if it hadn’t been used at some time. But you never know; it’s best to assume nothing; I’ve obviously had a few surprises in climate science. As to whether it is or isn’t useful, the proof is in the pudding. The two-parameter reconstruction of the satellite series is really quite remarkable in itself, so there’s something going on that warrants investigation, whether or not it yields anything at the end of the day.

    ARMA is not used very much in recent dendrochronology; it used to be done in the 1980s, but now they use “conservative” Jacoby or Briffa RCS methods attempting to “preserve low-frequency” information and they leave the autocorrelation in. There’s no ARMA in any of the dendro series that are important in the big multiproxy reconstructions (e.g. bristlecones, Polar Urals, Tornetrask, Gaspe, Jacoby series, Briffa series).

  48. TCO
    Posted Aug 11, 2005 at 5:47 PM | Permalink

    I haven’t looked at the reference yet, but it makes no sense to me for a spiky curve to be matched with so few degrees of freedom. I wonder if you are hiding some degrees on me…

  49. Steve McIntyre
    Posted Aug 11, 2005 at 6:04 PM | Permalink

    Re #48: It is counter-intuitive for someone used to working with continuous series to be able to generate a 2-parameter model like this, but there are only 2 parameters. I posted up the computer script generating the result in #10 above. Try downloading R and running it for yourself. Just cut the script and paste it onto your console. You can probably get some real output on your screen which looks just like what I posted up in under 5 minutes start to finish – R installation included.

    I thought a bit more about your statistics inquiry. Most statistics texts are not oriented towards autocorrelated time series, but to quite different problems. Because climate series are so autocorrelated, you really need to get your eye in from a different vantage point. A lot of the multiproxy climate statistics seems to me to grab the wrong end of the stick, so to speak.

    A text by Chatfield has been recommended to me, but I haven’t seen it. I got an old text by Wayne A Fuller, Introduction to Statistical Time Series out of the library recently and there was a lot that I liked in it. Some excellent discussion on variance of the mean and autocorrelation and autocovariance, that is missing from regular texts at the same level.

    Most of the work on autocorrelated series is from econometrics. My impression of multiproxy climate science is that their statistics would not meet 1950s standards in econometrics. So you might browse some of the recent econometrics text – I don’t know the texts. I’ll ask Ross for a suggestion.

  50. John S.
    Posted Aug 11, 2005 at 6:28 PM | Permalink

    Looking at my bookshelf right now I could suggest (depending on your level):
    Hamilton, James D. (1994) “Time Series Analysis”, Princeton University Press. (Graduate level text – mathematically precise and comprehensive but hard going.)


    Greene, William H. (1997) “Econometric Analysis”, Prentice Hall. (Also graduate level but a fair bit easier than Hamilton)

    I seem to remember using:

    Johnston, J “Econometric Methods”

    in my undergrad classes which I believe is somewhat of a classic.

  51. John Hekman
    Posted Aug 11, 2005 at 6:42 PM | Permalink

    Right now I’m looking at Pindyck and Rubinfeld, “Econometric Models and Economic Forecasts”, Fourth Ed., 1998, and the discussion of time series analysis is very clear. Lots of auto-correlation discussion.

  52. Greg F
    Posted Aug 11, 2005 at 7:47 PM | Permalink

    Re # 45

    …and recommended that one simply add white or red noise if you need variance.

    That is effectively what Moberg did in his reconstruction with the tree ring data.

  53. TCO
    Posted Aug 11, 2005 at 8:16 PM | Permalink

    I still wonder if somehow the format of the function itself is hiding a few degrees of freedom.

  54. Peter Hartley
    Posted Aug 11, 2005 at 8:21 PM | Permalink

    I was looking at the data in STATA and got almost the same results as you did for an ARIMA(1,0,1). The AR coefficient is .92089, and the MA one is -.3154068. I then noticed that I inadvertently dropped the 1978-12 observation and this probably explains the slight difference in results. My calculations suggest, however, that an AR(2) model might be a slightly better fit. It gives AR coefficients of .5845324 and .292017. The two log likelihoods for my data set are 249.9151 for the ARMA and 251.1993 for the AR(2). Just for the fun of it, I also tried an ARMAX model with the Mauna Loa CO2, solar irradiance and cosmic ray flux at Calgary as regressors. None of these were significant forecasters of the 5.2 version satellite global average at conventional p-values, but including them also made the trend term insignificantly different from zero.

    A brief comment to those who are wondering how the ARMA model can lead to a “jagged” series when the forecast path from the final value is so smooth: the point is that the model also includes a gaussian error term and the smooth response to a given shock is continually “thrown off track” by new shocks (draws from the error process) as time progresses. The forecast path in Steve’s picture is smooth because the best forecast of the shock term at any future period is its unconditional mean of zero.
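
    The contrast between a jagged simulated path and a smooth forecast can be sketched in a few lines. This is only an illustration, not the script used for the figure: the coefficients are the ones quoted in the post, while the shock standard deviation, the series length and the function names are assumptions made here.

```python
import random

def simulate_arma11(phi, theta, sigma, n, seed=0):
    """Simulate x[t] = phi*x[t-1] + e[t] + theta*e[t-1] with Gaussian shocks."""
    rng = random.Random(seed)
    x_prev, e_prev = 0.0, 0.0  # assumed start-up values
    series = []
    for _ in range(n):
        e = rng.gauss(0.0, sigma)  # a fresh shock every period makes the path jagged
        x = phi * x_prev + e + theta * e_prev
        series.append(x)
        x_prev, e_prev = x, e
    return series, e_prev

def forecast_arma11(phi, theta, last_x, last_e, steps):
    """Point forecasts: every future shock is replaced by its mean of zero."""
    x = phi * last_x + theta * last_e  # one step ahead still uses the last known shock
    path = [x]
    for _ in range(steps - 1):
        x = phi * x  # beyond one step only the AR term is left, so the path decays smoothly
        path.append(x)
    return path

# Hypothetical run: 320 simulated months, then a 24-month forecast.
series, last_e = simulate_arma11(0.9257, -0.3183, sigma=0.1, n=320)
path = forecast_arma11(0.9257, -0.3183, series[-1], last_e, steps=24)
```

    The simulated series keeps being knocked about by new draws, whereas the forecast path shrinks geometrically towards zero because nothing new is added to it.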

  55. TCO
    Posted Aug 11, 2005 at 8:24 PM | Permalink

    So it’s hiding some degrees of freedom?

  56. Ross McKitrick
    Posted Aug 11, 2005 at 8:47 PM | Permalink

    Re (#11) and others; Stern and Kaufman have done some high-tech time series work, but the Nature paper cited in the post had a problem. They used Granger causality modeling to look for a CO2 effect. “Granger causality” refers to the idea that in forecasting a variable Y, if lagged values of another variable X reduce the mean squared forecast error of Y then X is said to “Granger cause” Y. K&S used a slightly more involved version where X and Y lever off a third variable Z to infer causality, and they concluded CO2 Granger-causes temperature. That was good enough for Nature. However their conclusions were challenged by Umberto Triacca, an econometrician, who showed that the structure of their model yielded coefficients with several possible implications, including (1) that CO2 Granger-causes temperature change and (2) CO2 does not Granger-cause temperature change. His paper appeared in Theoretical and Applied Climatology vol 69, 137-138 (2001). Try this link: The last line in his paper is especially interesting: “This paper was submitted to Nature but rejected.”
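
    The basic Granger idea described above (lagged X helps forecast Y) can be sketched on simulated data. This is a toy illustration only: it compares in-sample squared errors rather than running a formal test, it is not the more involved three-variable version used in the Nature paper, and the data-generating coefficients and function names are assumptions made here.

```python
import random

def simulate_pair(n, seed=1):
    """Toy data in which lagged x genuinely feeds into y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.0]
    for t in range(1, n):
        y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0, 1))
    return x, y

def mse_own_lag(y):
    """Mean squared error when y[t] is regressed on y[t-1] only (no intercept)."""
    y_lag, y_now = y[:-1], y[1:]
    b = sum(a * c for a, c in zip(y_lag, y_now)) / sum(a * a for a in y_lag)
    return sum((c - b * a) ** 2 for a, c in zip(y_lag, y_now)) / len(y_now)

def mse_with_x_lag(y, x):
    """Mean squared error when y[t] is regressed on y[t-1] and x[t-1] (no intercept)."""
    y_lag, x_lag, y_now = y[:-1], x[:-1], y[1:]
    # Normal equations for two regressors, solved by Cramer's rule.
    a11 = sum(a * a for a in y_lag)
    a12 = sum(a * b for a, b in zip(y_lag, x_lag))
    a22 = sum(b * b for b in x_lag)
    b1 = sum(a * c for a, c in zip(y_lag, y_now))
    b2 = sum(b * c for b, c in zip(x_lag, y_now))
    det = a11 * a22 - a12 * a12
    c1 = (b1 * a22 - b2 * a12) / det
    c2 = (a11 * b2 - a12 * b1) / det
    return sum((c - c1 * a - c2 * b) ** 2
               for a, b, c in zip(y_lag, x_lag, y_now)) / len(y_now)

x, y = simulate_pair(500)
print(mse_own_lag(y), mse_with_x_lag(y, x))
```

    If adding the lag of x cuts the error materially, x is a candidate “Granger cause” of y; a proper analysis would use out-of-sample errors or an F-test rather than this raw comparison.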

  57. Steve McIntyre
    Posted Aug 11, 2005 at 10:19 PM | Permalink

    Peter (Hartley), I’m more intrigued by the ARMA (1,1) model since I’m interested in processes with high AR1 coefficients and in particular by ARMA(1,1) processes with high AR1. I’m not entirely sure where I’m going with this interest, but there are two recent articles on spurious regression that pertain to it. Ferson et al [2003] have a google-able article in which they demonstrate spurious regression in circumstances where the target series does not have high persistence (which has been a characteristic of spurious regression explanations of the random walk type, e.g. Granger and Newbold [1974], Phillips [1986] and of fractional processes [Tsay and Chung, 2000]). Ferson also has an interesting discussion of the interaction with data mining, pointing out that data mining results in picking of the most persistent series – this article has really impressed me for a long time. Ai Deng [2005] gives an asymptotic explanation for Ferson et al [2003] by suggesting that it has "almost white almost integrated" noise i.e. ARMA (1,1) with an AR1 over 0.9, resulting in sufficient persistence to yield a spurious regression phenomenon but with a disguise from the MA1 component, so that it resists conventional identification tests since the residuals are "almost white", but not. He borrows the residuals structure from Nabeya and Perron [~1993]. Proxies like Gaspe or the North American PC1 have autocorrelation structures that are off the charts. There is a data mining tendency in Mann’s regression module, which we’ve referred to, but not discussed, which I think is along the lines of what Ferson describes. Cheers, Steve

  58. Steve McIntyre
    Posted Aug 11, 2005 at 11:24 PM | Permalink

    Re #55: that’s a good question. I’m pretty sure that the following is correct: if you fit an AR1 process, you do so through a simple linear regression and the standard error is just the usual regression standard error and only one degree of freedom is used. The linear algebra will be pretty similar to a trend fit. An AR1+MA1 fit is a little more involved but the principles in terms of df will be similar.
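
    As a rough sketch of the regression view of an AR1 fit: the series is regressed on its own previous value. The version below assumes a zero-mean series (so no intercept, and a single estimated coefficient); the function name is hypothetical and this is not how any particular package estimates the model.

```python
def fit_ar1_ols(x):
    """Least-squares fit of x[t] = phi * x[t-1] + e[t] for a zero-mean series."""
    x_lag, x_now = x[:-1], x[1:]
    n = len(x_now)
    sxx = sum(a * a for a in x_lag)
    phi = sum(a * b for a, b in zip(x_lag, x_now)) / sxx
    residuals = [b - phi * a for a, b in zip(x_lag, x_now)]
    # One coefficient estimated, so n - 1 residual degrees of freedom.
    sigma2 = sum(r * r for r in residuals) / (n - 1)
    se_phi = (sigma2 / sxx) ** 0.5
    return phi, se_phi, residuals

# A noiseless check: each value is exactly half the previous one.
phi, se_phi, residuals = fit_ar1_ols([1.0, 0.5, 0.25, 0.125])
```

    The only difference from a trend regression is the regressor: the previous observation takes the place of the time variable.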

    However, the ARMA fit has the previous point available to it (think Markov process). So the previous point has somewhat the same role as the time variable in a trend regression (which need not be uniform). Is that a “hidden degree of freedom”? Not in the usual definition, but the availability of the prior history (one step back) has to be what contributes to the better visual fit.

    I calculated log-likelihoods for both approaches. I’m pretty sure that this statistic will be as close to apples and apples as you can get and the ARMA log-likelihood was hugely better than the trend log-likelihood.
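
    For readers who want to see what such a comparison involves, here is a minimal sketch of a Gaussian log-likelihood computed from a set of residuals, with the error variance set to its maximum-likelihood value. It is an approximation under stated assumptions (independent Gaussian errors, variance estimated as RSS/n); a package’s exact ARMA likelihood may differ slightly, and the numbers quoted in the post are not reproduced here.

```python
import math

def gaussian_loglik(residuals):
    """Log-likelihood of residuals under i.i.d. N(0, sigma^2), sigma^2 = RSS / n."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

# Smaller residuals give a higher log-likelihood for the same number of points.
loose = gaussian_loglik([1.0, -1.0, 1.0, -1.0])
tight = gaussian_loglik([0.5, -0.5, 0.5, -0.5])
```

    Feeding in the trend residuals and the one-step-ahead ARMA residuals from the same series puts the two models on the same scale.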

  59. Larry Huldén
    Posted Aug 12, 2005 at 12:05 AM | Permalink

    A general comment to heat island effect and urban warming.
    When thinking of an observation site close to a building, it is true that a higher temperature can be detected in relation to a more remote site. In case of climate warming, this could cause measurements which are the same as the real trend or even show a slightly lower warming trend, because the “building” effect would to some extent be cancelled. Actually the higher temperature close to the building is not important in this case as long as the environment is static.
    It is the changing urban environment which is causing an exaggeration of the heating effect. Heating becomes more intensive during the urbanization process and the trend is deviating from the trend in a rural environment. In the above mentioned case of a building effect the difference in trends would be negligible.
    The “urbanization” warming effect is not limited to big urban centers, it occurs anywhere in a growing human settlement.
    We don’t know exactly how much the real warming trend deviates from the urban warming trend. IPCC claims to have compensated for this effect by separating urban and rural sites at a population size of 50,000. The urban effect starts certainly at a much lower population size.

  60. James Lane
    Posted Aug 12, 2005 at 12:37 AM | Permalink

    The UHI effect is also apparent within an urban area. The main reporting point for Sydney is “Observatory Hill”. As suggested by the name, this was the location of the first observatory in Sydney, established in the late 18th century.

    These days, Observatory Hill is immediately adjacent to the approach to the Sydney Harbour Bridge and right up against the tall buildings of the CBD. In February this year, it was announced on the radio that the highest Sydney temperature ever was recorded at Observatory Hill. I live about 3km from Observatory Hill, it didn’t feel especially warm, and I immediately logged on to the real-time service provided by the Meteorological Bureau. A nearby site on Sydney Harbour was a full 3 degrees Celsius cooler than Observatory Hill. The only site anywhere near the temperature recorded at Observatory Hill was Sydney Airport, with its acres of tarmac.

    If you search for a record of Sydney temperatures, the record you get is Observatory Hill.

  61. Doug Z
    Posted Aug 12, 2005 at 12:52 AM | Permalink

    re #18

    Well, isn’t that just great…You post all this stuff Steve, stuff that is meaningless to 99% of us, and then that’s it? What’s the point? That we take it on trust that you’re right? Yeah, just like you do with MBH I suppose.

    Peter, I hope that you’re not serious with that kind of twisted “logic”, but just in case….RTFM! There’s a difference between withholding data and not giving tutorials on subjects about which people could educate themselves if they could be bothered – but I’m sure that you already know that and are once again just trying to divert attention from the real issues.

  62. Peter Hartley
    Posted Aug 12, 2005 at 7:10 AM | Permalink

    A brief comment on #57: Steve, I would have thought that “almost white almost integrated” would require the AR(1) coefficient to be close to 1 and the MA(1) close to -1, so the two terms in the lag operator almost cancel. If we write the process Xt = rho*Xt-1+Et+theta*Et-1, for Et white noise, it can also be written as (1-rho*L)Xt=(1+theta*L)Et and if rho is approximately -theta then Xt will be close to the white noise Et. In the example in question the MA coefficient of -.315 seems a long way from minus the AR coefficient. Also, it seems to me that internal dynamics in the climate system is likely to lead to an autocorrelated response of temperature to exogenous shocks while an MA shock instead suggests dynamics in the driving processes. Would that physical interpretation of the time series model tend to favor the AR over the MA as a summary description of the data?
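
    The cancellation argument can be checked numerically with the textbook autocorrelation formula for an ARMA(1,1) process: the lag-1 autocorrelation is (1 + rho*theta)(rho + theta) / (1 + 2*rho*theta + theta^2), and higher lags decay by a factor of rho each step. The sketch below is illustrative only; the function name is an assumption made here.

```python
def arma11_acf(rho, theta, max_lag):
    """Theoretical autocorrelations at lags 1..max_lag of Xt = rho*Xt-1 + Et + theta*Et-1."""
    acf1 = (1 + rho * theta) * (rho + theta) / (1 + 2 * rho * theta + theta ** 2)
    return [acf1 * rho ** (k - 1) for k in range(1, max_lag + 1)]

# Exact cancellation: theta = -rho leaves white noise, so every autocorrelation is zero.
cancelled = arma11_acf(0.9, -0.9, 5)

# The fitted satellite coefficients quoted in the post are far from cancelling.
fitted = arma11_acf(0.9257, -0.3183, 5)
```

    With theta equal to minus rho the factor (rho + theta) vanishes, whereas with the fitted values the series stays strongly autocorrelated, which is the distance from “almost white” that the comment points to.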

  63. TCO
    Posted Aug 12, 2005 at 7:34 AM | Permalink

    One of the references I googled was all about why ARMAs should be used with great hesitation, that often a simple line was more relevant analysis than the ARMA…just interesting…

  64. Steve McIntyre
    Posted Aug 12, 2005 at 7:56 AM | Permalink

    Peter (Ha.), I’ve been intrigued with extreme autocorrelation from the proxy aspect. I’ll post up the ACFs of the NOAMER PC1, the Gaspe tree ring set and some others to give a flavor.

    However, in the satellite data with a trend model, there’s an outright Durbin-Watson failure, so one doesn’t need fancy spurious regression theories.
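
    For reference, the Durbin-Watson statistic is simple to compute from the residuals of a trend fit. The sketch below is a minimal illustration with hypothetical function names; values near 2 are consistent with uncorrelated residuals, while values well below 2 point to positive autocorrelation.

```python
def trend_residuals(y):
    """Residuals from an ordinary least-squares straight line fitted to y against its index."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    sty = sum((t - t_mean) * (yt - y_mean) for t, yt in enumerate(y))
    slope = sty / stt
    intercept = y_mean - slope * t_mean
    return [yt - intercept - slope * t for t, yt in enumerate(y)]

def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(r * r for r in residuals)
    return num / den

# Residuals that flip sign every step give a high value; identical residuals give zero.
alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
persistent = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

    Applying `durbin_watson(trend_residuals(series))` to a strongly persistent series would be expected to give a value far below 2.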

    I’m intrigued with Ferson and Deng, because they appear to identify some situations with spurious regression that are resistant to Durbin-Watson. Some of the situations that they considered had high theta’s. Obviously low theta is classic spurious regression. The effect that I’ve noticed with series like Tornetrask is that the AR1 coefficient is much higher in an ARMA(1,1) model than in the ARMA(1,0), which is the usual benchmark (if one is used at all) and that the ARMA(1,1) is much better than the ARMA(1,0) model. Thinking out loud, the question is then – maybe moderate theta’s e.g -.3 or maybe -0.6 in combination with a very high AR1 >0.9 or even >0.95, are sufficient to provide DW resistance: maybe you don’t need to be “almost white”, maybe just a little bit.

    TCO, I’m not advocating ARMA as a magic bullet – what’s the reference that you had in mind?

  65. TCO
    Posted Aug 12, 2005 at 8:08 AM | Permalink

    I can’t find it again, don’t remember exact search terms. BTW, I did not find a good overview description of ARMA basics. Many assumed that you knew the basics or that you knew AR and MA to start with. The one that I made offhand reference to was regarding issues in using ARMA, common mistakes etc. I didn’t follow that much of it either, and it was late.

  66. Steve McIntyre
    Posted Aug 12, 2005 at 9:39 AM | Permalink

    Here’s an interesting article which appears germane to these issues: Helle Bunzel & Timothy Vogelsang, 2003. “Powerful Trend Function Tests That are Robust to Strong Serial Correlation with an Application to the Prebisch Singer Hypothesis,” Econometrics . I’m trying to figure out how to implement these and would welcome any thoughts on it.

  67. J. Sperry
    Posted Aug 12, 2005 at 10:38 AM | Permalink

    Re: 44 (somewhat tongue-in-cheek)
    Do your walking experiments include the effect of the rise in body temp on the perception of a change in air temp?

  68. Michael Jankowski
    Posted Aug 18, 2005 at 11:47 AM | Permalink

    Do your walking experiments include the effect of the rise in body temp on the perception of a change in air temp?

    It would have to be considered, of course. I’d prefer to remove human body effects altogether by getting some big grant $$$ to put continuously-monitored weather stations on a grid system…

  69. John Creighton
    Posted Sep 2, 2006 at 3:22 PM | Permalink

    Steve, the results look surprisingly good. I was just looking up satellite data prior to discovering this thread
    and the satellite data appeared to agree closely with the instrumental data, yet I have not seen anyone fit the instrumental data this well. What was your input for the ARMA model and what sampling period did you use (Year month?).

  70. Steve McIntyre
    Posted Sep 2, 2006 at 4:35 PM | Permalink

    Plot Script

  71. John Creighton
    Posted Sep 2, 2006 at 9:07 PM | Permalink

    Thank you for the script. When I first saw the data the fit looked spectacular. I noticed two things that make the fit look less impressive. The first is that the sample time of the ARMA model is monthly as opposed to yearly. Although a good prediction from monthly data may still be impressive, the smaller the time interval the easier it is to predict the future data from the past data and consequently the better a low order ARMA model will fit. There is a type of analog to digital converter called a sigma delta converter. When the sampling rate is fast the sigma delta converter only needs a constant function to predict future values from past values.

    The second thing I notice in the algorithm is that the moving average part is a function of the errors and not of known inputs.

    X[t] = a[1]X[t-1] + … + a[p]X[t-p] + e[t] + b[1]e[t-1] + … + b[q]e[t-q]

    I do not know the likelihood function and how it constrains these values of the error. Nonetheless, if this algorithm fits the satellite data much better than the instrumental data, I will accept that the satellite data is better suited for building models. Unfortunately I do not see the same procedure applied to the instrumental data.
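
    On the question of where the errors come from: for given coefficients they are not free inputs but are recovered recursively from the data, each one being the miss of the one-step-ahead prediction. The sketch below shows the recursion for the p = q = 1 case, assuming a zero-mean series and zero start-up values; the function name is hypothetical, and whether the fitted curve in the figure is exactly this one-step-ahead prediction is an assumption.

```python
def one_step_fit(x, a1, b1):
    """One-step-ahead predictions and implied errors for X[t] = a1*X[t-1] + e[t] + b1*e[t-1]."""
    preds, errors = [], []
    x_prev, e_prev = 0.0, 0.0  # assumed start-up values
    for xt in x:
        pred = a1 * x_prev + b1 * e_prev  # uses only information available before time t
        e = xt - pred                     # the error is whatever the prediction missed
        preds.append(pred)
        errors.append(e)
        x_prev, e_prev = xt, e
    return preds, errors
```

    Because each prediction sees the previous observation, a curve of one-step-ahead values will track a persistent monthly series closely even with only two coefficients.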

  72. John Creighton
    Posted Sep 5, 2006 at 9:46 PM | Permalink

    I was thinking about the figure in the original post and how to compare it with estimates obtained at different sampling rates. One thing I thought of was transforming the estimate obtained as a fit to the annual data to a model that is sampled yearly instead of monthly. Clearly the fit would be considerably worse. I am as yet unsure of the significance of this somewhat obvious observation.
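
    A back-of-envelope version of the sampling-rate point, under simplifying assumptions: treat the series as a pure AR(1) process and take every 12th monthly value rather than an annual average. The persistence at the longer interval is then the monthly coefficient raised to the 12th power, and the share of variance explained by the previous observation is its square. The function names are hypothetical.

```python
def ar_coefficient_at_interval(phi_monthly, months):
    """AR(1) coefficient between observations spaced `months` apart."""
    return phi_monthly ** months

def explained_fraction(phi_monthly, months):
    """Share of variance predictable from the previous observation at that spacing."""
    return ar_coefficient_at_interval(phi_monthly, months) ** 2

monthly_share = explained_fraction(0.9257, 1)
yearly_share = explained_fraction(0.9257, 12)
```

    Under these assumptions the one-step predictability drops sharply when moving from monthly to yearly spacing, which is consistent with the expectation that the fit would look considerably worse.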

