a zero r1 from the residuals in a trendline regression indicates modest positive serial correlation in the errors. But it won’t be significant, so you’re justified in just ignoring the problem in most cases.

Yes. That’s what I get. I get negative serial autocorrelation.

When I asked my question, “you” referred specifically to *Kenneth*. He’s trying to explain some specific results he got using the Santer tropospheric data, and I’m trying to do comparable synthetic runs to see what the type I and type II errors for his method turn out to be overall.

STHTWLSFGJKKMNSSW, aka Santer08, data.

In fact, as I show in my recent working paper

Your working paper would appear very lonely compared to the 17-author Santer et al. (2008).

I have been attempting to better understand ARIMA and ARMA models so that I can properly simulate some temperature series. I found this link to be useful for me:

http://www.duke.edu/~rnau/411arim.htm

On my initial simulations (with large AR1 correlations) I was seeing significant autocorrelation of residuals out to very high orders (greater than AR(10)). Then I discovered the pacf function in R to go with the acf, and quickly discovered that it was AR(1) that was contributing to all those higher-order correlations. Now watch me abuse the pacf function.
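To illustrate what acf and pacf show here: for an AR(1) process the autocorrelation function decays geometrically across many lags, which looks like spurious high-order structure, while the partial autocorrelation cuts off after lag 1. A self-contained sketch (in Python rather than R, with an arbitrary φ of 0.8 and series length of 2000; the pacf is computed with a standard Durbin-Levinson recursion):

```python
import random

def simulate_ar1(n, phi, seed=1):
    """Generate x[t] = phi * x[t-1] + e[t] with standard normal shocks."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x

def acf(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag (denominator n)."""
    n, m = len(x), sum(x) / len(x)
    c0 = sum((v - m) ** 2 for v in x) / n
    return [sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / n / c0
            for k in range(1, max_lag + 1)]

def pacf(x, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(x, max_lag)
    phi = [[0.0] * (max_lag + 1) for _ in range(max_lag + 1)]
    phi[1][1] = r[0]
    pac = [r[0]]
    for k in range(2, max_lag + 1):
        num = r[k - 1] - sum(phi[k - 1][j] * r[k - 1 - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1][j] * r[j - 1] for j in range(1, k))
        phi[k][k] = num / den
        for j in range(1, k):
            phi[k][j] = phi[k - 1][j] - phi[k][k] * phi[k - 1][k - j]
        pac.append(phi[k][k])
    return pac

x = simulate_ar1(2000, 0.8)
print([round(v, 2) for v in acf(x, 5)])   # decays slowly: looks like high-order structure
print([round(v, 2) for v in pacf(x, 5)])  # only lag 1 is large
```

In R the equivalent is simply `acf(x)` and `pacf(x)` on the simulated series.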

Now, for the annual average data, I didn’t quite know what to do to figure out the type I error for what you actually do. The average lag 1 autocorrelation based on 8 years’ worth of data was -0.253. For most cases, the lag 1 autocorrelation would not be significant. However, for some it would. So, when you apply your method, do you apply the Nychka correction based on the observed correlation? Or not?
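For reference, the effective-sample-size version of the correction (as I understand the form used in Santer08; whether this is exactly Nychka's method is my assumption) replaces n with n_eff = n(1 - r1)/(1 + r1) and rescales the trend standard error accordingly. A minimal Python sketch:

```python
import math

def nychka_adjusted_se(se_ols, n, r1):
    """Adjust an OLS trend SE for lag-1 autocorrelation r1 using an
    effective sample size n_eff = n * (1 - r1) / (1 + r1); the SE is
    scaled by sqrt((n - 2) / (n_eff - 2))."""
    n_eff = n * (1.0 - r1) / (1.0 + r1)
    return se_ols * math.sqrt((n - 2.0) / (n_eff - 2.0))

# With the -0.253 average r1 on 8 annual points, n_eff > n, so the
# "correction" actually *shrinks* the SE rather than inflating it:
print(nychka_adjusted_se(1.0, 8, -0.253))
print(nychka_adjusted_se(1.0, 30, 0.5))  # positive r1 inflates the SE
```

That sign flip is exactly why the question matters: applying the correction mechanically to a negative observed r1 makes the test *less* conservative.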

The traditional Durbin-Watson protocol is to ignore negative serial correlation in the residuals and to only worry about positive serial correlation. For this reason, the DW tables are 1-tailed with a null of zero SC and an alternative of positive SC.

The reason for this is presumably that positive serial correlation in the errors typically makes OLS t-statistics overstated. Good statisticians never want to overstate their results, but can live with understated results.

Furthermore, if there is zero serial correlation in the errors themselves, the residuals of a trend regression will be expected to have a negative first-order serial correlation, especially when the sample is small, as shown in your simulation with 8 years of data, or 30 years as in the STHTWLSFGJKKMNSSW, aka Santer08, data.
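This point is easy to verify by simulation: detrend pure white noise and the residual lag-1 correlation averages out negative, more so for short samples. A Python sketch (parameters arbitrary; compare the n = 8 result with the -0.253 figure above):

```python
import random

def trend_residuals(y):
    """Residuals from an OLS straight-line fit of y against t = 0..n-1."""
    n = len(y)
    t = list(range(n))
    tbar, ybar = sum(t) / n, sum(y) / n
    sxx = sum((ti - tbar) ** 2 for ti in t)
    b = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y)) / sxx
    a = ybar - b * tbar
    return [yi - (a + b * ti) for ti, yi in zip(t, y)]

def lag1_corr(e):
    """Lag-1 autocorrelation (OLS residuals already sum to zero)."""
    return sum(e[i] * e[i + 1] for i in range(len(e) - 1)) / sum(v * v for v in e)

def mean_r1(n, reps=20000, seed=42):
    """Average residual r1 over many detrended white-noise series of length n."""
    rng = random.Random(seed)
    return sum(lag1_corr(trend_residuals([rng.gauss(0, 1) for _ in range(n)]))
               for _ in range(reps)) / reps

print(round(mean_r1(8), 3))   # clearly negative with only 8 points
print(round(mean_r1(30), 3))  # still negative, but closer to zero
```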

In fact, as I show in my recent working paper at http://www.econ.ohio-state.edu/jhm/papers/MomentRatioEstimator.pdf, a zero r1 from the residuals in a trendline regression indicates modest *positive* serial correlation in the errors. But it won’t be significant, so you’re justified in just ignoring the problem in most cases.

A “month” is not a period that has any obvious meaning.

Tell that to any female member of the family.

Actually the Q1-Q4 that economists employ might work better for climate science.

Hmmm, interesting point. A “month” is not a period that has any obvious meaning. As you observe, monthly AR1 must come somehow from daily and annual structures.

It’s ironic that modern climate science applies a period preserved from the Babylonians or Assyrians or even earlier. Perhaps in addition to her other attributes, Ishtar is the goddess of AR1.

We had a frost last Sunday. Fortunately, I covered my annuals in pots, and had deferred planting other annuals. I did sprinkle basil seeds the day after the frost. I have extra packages of seeds, so I figure it’s worth the risk to sprinkle 1 set.

Yes. You do need to compensate for the autocorrelation if you use monthly data (or any data that is autocorrelated). There might be some combination of spectral properties, averaging windows and amounts of data that sometimes makes using annual averages better. I tend to doubt it, but I haven’t actually tested anything more than white noise (where monthly *always* beats annual average) and the case where the monthly data is AR(1) with lag 1 correlations about what we are seeing, and the length of data sets I’ve been using.
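For the white-noise case, the monthly-versus-annual comparison doesn't even need a simulation: the OLS slope variances can be computed directly. A sketch with arbitrary numbers (8 years of data, unit monthly noise variance):

```python
def slope_var(times, noise_var):
    """Variance of the OLS slope for observations at `times` with iid noise."""
    tbar = sum(times) / len(times)
    return noise_var / sum((t - tbar) ** 2 for t in times)

N, sigma2 = 8, 1.0                          # years of data; monthly noise variance
monthly_t = [m / 12.0 for m in range(12 * N)]
annual_t = [y + 0.5 for y in range(N)]      # annual means, centered mid-year

v_m = slope_var(monthly_t, sigma2)          # 96 noisy points
v_a = slope_var(annual_t, sigma2 / 12.0)    # 8 points, each a 12-month mean
print(v_m, v_a)  # nearly identical slope variances

# The slope estimate itself is only barely sharper with monthly data; the
# real power gain under white noise is degrees of freedom (94 vs 6
# residual df), which makes the critical t-value much smaller for the
# monthly test.
```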

My motivation in doing this was that people (particularly those who don’t like what I find) frequently suggest that somehow I’m picking monthly because I “like” the results, or I’m picking OLS when in fact some other idiosyncratic method they dreamed up would be “better”. Anyway, I wanted a specific criterion for comparing a suggested method to what I am currently doing. So, my method is: if both are applied at the same confidence level, pick the method with the lower type II error. That is, I figure out which of two suggested methods that a) use the same assumptions about the noise and b) use the same confidence level has the lower type II error.

Monthly has been winning out when I’ve checked which is better *given* a set of assumptions. I’ve gotten “monthly is better” every time I’ve checked. (But “every” means very few cases.) I anticipated this answer–but if someone found something else with other assumptions, I’d believe them. (Then, I’d rerun the numbers. But that’s me. 🙂 )

I was attempting a thought experiment while planting annuals, and it will probably reinforce the fact that I am a better gardener than would-be statistician. It had to do with the end point data on a monthly series. I then kind of got into an argument with myself and decided that it would be an opportune time to use some R code to do some synthetic analyses with monthly versus annual time series. It is discussions like this one that provide me the incentive to look deeper and in more detail into these statistical choices.

I agree that doing the analyses by differences, where practical, is the preferred method of comparison and can make determining statistical significance less complicated.

My only problem with using monthly data lies in the extreme case where one has a time series (or other series, for that matter) in which subdividing the data only provides more degrees of freedom, and one does not attempt to compensate for autocorrelation of the residuals; in that case one would with high probability be struck dead by the Stats gods. In this extreme case (not saying it can be related directly to temperature series) I am thinking that the end-to-start joins of the subdivided parts would cause problems for an AR1 adjustment.

The critical question about using annual versus monthly data (if one assumes a reasonably good adjustment for autocorrelation of the monthly data) then becomes one of expectations of temperature trend changes occurring on an annual or monthly basis, or, putting both of these propositions in question, whether the occurrences are on a decadal or longer basis.

Look at how your AR models perform within a given series. I bet the estimated AR coefficients are not stable through time.

I’m absolutely sure they will be unstable. If nothing else, the assumption that the series over the 20th century is a linear trend + noise is poop. It’s known to be poop by everyone. So, in that context, I restrict my choices to: Given that we are making assumption “X”, which method would be best if assumption “X” was true.

Does my doing this necessarily mean I believe “X” true? Not necessarily. But sometimes it’s worth making assumptions for the purpose of trying to pin something down. I don’t know how this fits into the way real statisticians approach problems. But I guess being an engineer, I don’t see it as much different from “If the thermocouple reached steady state, what would the temperature be?” “If we account for phenomena A, B and C, ignoring D, on the evolution of temperatures for the thermistor, what do we discover?” We can get *that* answer and then later investigate the effect of “D”. Doing the analysis ignoring D does not mean that we absolutely, positively are sure D should be ignored. It only means we want to know the answer under a certain specific set of assumptions.

I’m not suggesting that AR(1) is “the truth”. Ken asked why I pick monthly data rather than annual averaged data when doing analyses. I pick it because I think that, all other things being equal, using monthly data rather than averaging first to create annual average data results in tests with greater statistical power.

On the one hand we can’t know the true statistical model. But we can assume one and, given the assumed statistical models, determine which method has greater power. I told Ken I checked for two cases: linear trend with white noise and linear trend with AR(1). In principle, I could check with any number of statistical models. But… well, that would be time consuming. Still, it does seem to me that *if*, to do an analysis, we make an assumption about the statistical properties of the noise, then we should select the use of monthly rather than annual average data based on which method has greater statistical power *under the assumption already made*.

Ken… Roger Sr. suggested a reference for a manuscript I put together. And… I’m going to look at something. But, Tom Wigley has a section discussing what to do when the “noise” in two separate time series is autocorrelated. You don’t treat them the way Santer treated them. And guess what? The “noise” from Pinatubo (and others) a) must be autocorrelated across runs of models and b) must be correlated with the earth’s data series.

So, I’m going to create a time series by taking the multi-model average over all runs and subtracting the observations. Then do the method of Nychka on *that*.

What you are testing here is whether the difference has a trend. And, if I’m interpreting the Wigley paper correctly, this is the right thing to do. (His discussion relates to comparing UAH to RSS.)
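A sketch of that difference-series test, with a made-up difference series (a small residual trend plus AR(1) noise; every parameter here is an assumption for illustration), and with the effective-sample-size adjustment standing in for Nychka's method:

```python
import math, random

def ar1_adjusted_trend_test(diff):
    """OLS trend on a difference series, with the SE inflated for lag-1
    autocorrelation via n_eff = n * (1 - r1) / (1 + r1)."""
    n = len(diff)
    t = list(range(n))
    tbar, dbar = sum(t) / n, sum(diff) / n
    sxx = sum((ti - tbar) ** 2 for ti in t)
    b = sum((ti - tbar) * (di - dbar) for ti, di in zip(t, diff)) / sxx
    a = dbar - b * tbar
    e = [di - a - b * ti for ti, di in zip(t, diff)]
    r1 = sum(e[i] * e[i + 1] for i in range(n - 1)) / sum(v * v for v in e)
    n_eff = n * (1 - r1) / (1 + r1)
    se = math.sqrt(sum(v * v for v in e) / (n - 2) / sxx)
    se_adj = se * math.sqrt((n - 2) / (n_eff - 2))
    return b, b / se_adj

# Made-up "multi-model average minus observations" series: a small
# residual trend (0.001/month) plus mildly autocorrelated noise.
rng = random.Random(3)
diff, prev = [], 0.0
for m in range(120):
    prev = 0.3 * prev + rng.gauss(0, 0.1)
    diff.append(0.001 * m + prev)

trend, tstat = ar1_adjusted_trend_test(diff)
print(trend, tstat)  # is the difference trend distinguishable from zero?
```

The point of differencing first is that whatever common "noise" (Pinatubo and the like) is shared between the model mean and observations largely cancels before the trend test is run.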
