A couple of days ago, I pointed out that the satellite GLB series could be modeled very well as an ARMA(1,1) model with parameters AR1 = 0.9215 and MA1 = -0.3185. I’ve gotten increasingly interested in ARMA(1,1) models with very high (>0.9) AR1 coefficients. Vogelsang has some tables showing some very unexpected tendencies for ARMA processes in this parameter range to produce spurious trends.
Vogelsang is a very sophisticated econometrics author. Vogelsang, "Trend Function Hypothesis Testing in the Presence of Serial Correlation", Econometrica 66, 123-146, stated:
"It is shown that OLS-based Wald statistics [e.g. t-statistics] suffer from substantial finite size distortions".
The t-statistic in a canned linear regression is an example of an "OLS-based Wald statistic". Thus, in the case at hand, a regression of the satellite data against time, one gets a seemingly significant t-statistic (as shown in the summary of the simple trend regression below):
#              Estimate   Std. Error  t value  Pr(>|t|)
# (Intercept)  -24.29584  2.60982     -9.309   <2e-16 ***
# year           0.01222  0.00131      9.325   <2e-16 ***
I pointed out before that the Durbin-Watson statistic for this regression is 0.45, showing mis-specification. A Durbin-Watson statistic really only tests against an AR1 error structure and doesn’t appear to be particularly powerful against an ARMA(1,1) error structure, although in this case the AR1 component is so strong (>0.92) that the Durbin-Watson picks up the mis-specification.
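As a rough cross-check on that reading, here is a minimal Python sketch of the Durbin-Watson statistic, together with the large-sample approximation DW ≈ 2(1 - ρ1), where ρ1 is the lag-1 autocorrelation of the errors. The function names are illustrative only, and the ρ1 formula is the textbook one for a stationary ARMA(1,1) process; it describes the error process itself, not the residuals of the actual detrended satellite series, so it is only a ballpark figure.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared first differences over squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


def arma11_lag1_autocorr(phi, theta):
    """Theoretical lag-1 autocorrelation of a stationary ARMA(1,1) process
    x[t] = phi * x[t-1] + e[t] + theta * e[t-1]."""
    return (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta ** 2)


phi, theta = 0.9215, -0.3185        # AR1 and MA1 coefficients quoted above
rho1 = arma11_lag1_autocorr(phi, theta)
dw_approx = 2 * (1 - rho1)          # large-sample approximation, not the sample DW
print(f"lag-1 autocorrelation: {rho1:.3f}")
print(f"approximate Durbin-Watson: {dw_approx:.3f}")
```

With these coefficients the approximation comes out at about 0.34, in the same neighbourhood as the 0.45 from the regression and a long way below the value of 2 expected for uncorrelated errors.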
Vogelsang did a lot of simulations with ARMA(1,1) models to determine how often various test statistics (including the OLS t-statistic) spuriously reported a statistically significant trend. His Table 1 considers AR1 coefficients of 0.8, 0.9, 0.95 and 1.0 combined with MA1 coefficients of -1, -0.8, -0.4, 0, 0.4 and 0.8, for series lengths of 100, 250 and 500. In each case, the WORST performance came with an MA1 coefficient of -0.4 (very close to the MA1 coefficient here of -0.32). In addition, the performance deteriorated as the AR1 coefficient increased, all the way up to 1.0.
The probability of spurious trend identification through a t-statistic for a series of 321 measurements with an AR1 coefficient of 0.92 and MA1 coefficient of -0.4 is about 34%, instead of 5% (see Vogelsang Table 1). Because the maximum with respect to the MA1 coefficient lies somewhere in the interval [-0.4, 0], the interpolation is a little less precise than one would like, but one could safely say that it would be at least 30% for the values at hand (an AR1 coefficient of 0.92 and MA1 coefficient of -0.32). Vogelsang proposed some other statistics, which I’m trying to figure out.
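The kind of experiment behind such a table can be sketched in a few lines of Python. The sketch below is not Vogelsang's procedure: it simulates trendless ARMA(1,1) series with standard normal shocks and counts how often the plain textbook OLS t-statistic on the time slope exceeds 1.96 in absolute value. The function names, burn-in length, number of simulations and seed are all my own assumptions. Since the statistics tabulated by Vogelsang may be constructed differently from this naive t-statistic, the rate printed here should not be expected to match his Table 1; the point is only that it should come out well above the nominal 5%.

```python
import math
import random


def simulate_arma11(n, phi, theta, rng, burn_in=200):
    """Simulate x[t] = phi*x[t-1] + e[t] + theta*e[t-1] with N(0,1) shocks and no trend."""
    x_prev, e_prev = 0.0, 0.0
    out = []
    for t in range(n + burn_in):
        e = rng.gauss(0.0, 1.0)
        x = phi * x_prev + e + theta * e_prev
        if t >= burn_in:              # discard start-up values
            out.append(x)
        x_prev, e_prev = x, e
    return out


def ols_trend_t_stat(y):
    """Textbook t-statistic of the slope in an OLS regression of y on a time index."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y[t] - y_mean) for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    rss = sum((y[t] - intercept - slope * t) ** 2 for t in range(n))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


def spurious_trend_rate(n, phi, theta, n_sims=500, crit=1.96, seed=0):
    """Share of trendless simulated series whose slope looks 'significant'."""
    rng = random.Random(seed)
    hits = sum(abs(ols_trend_t_stat(simulate_arma11(n, phi, theta, rng))) > crit
               for _ in range(n_sims))
    return hits / n_sims


rate = spurious_trend_rate(n=321, phi=0.92, theta=-0.4)
print(f"share of trendless series with |t| > 1.96: {rate:.2f}")
```

Setting `phi=0.0` and `theta=0.0` gives white noise, for which the same function should return something close to 5%, a useful sanity check on the setup.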
Anyway, for some strange reason, the ARMA(1,1) structure of the satellite data appears to be almost ideally suited to producing spurious trends for time series of 250-500 measurements. This is not to say that the observations are inconsistent with the stated trend; just that they do not appear to be inconsistent with no trend either.