Spurious #5: Variance of Autocorrelated Processes

What is the standard deviation (variance) of an autocorrelated series? It sounds like an easy question, but it isn’t. The issue turns out to affect the spurious regression problem, so I’m posting a short note on it. These issues are well known in econometrics, where they have led to “heteroskedasticity and autocorrelation consistent” (HAC) estimators. There’s an interesting discussion in Percival [1993], who is involved in climate and who says:

the effect of not knowing the process mean can be rather large: whereas s is an unbiased estimator of σ when µ is known, the commonly used s can severely underestimate σ when µ is unknown.

Percival gives an example where small-sample methods are biased by a factor of 100!

The caption to Figure 2 of Percival [1993] reports that the variance of an autoregressive process with ρ = 0.997 is 166.9. This can be derived as follows.

An autoregressive process is defined by:
(1) y[t] = ρ * y[t-1] + ε[t], where ε is the error-generating process (“innovations”) with mean 0, assumed independent with standard deviation σ[ε].

Then the value at time t can be represented in terms of the innovations as follows:
(2) y[t] = Σ (ρ^j) * ε[t-j], where j = 0:∞

The variance of y[t] is generated by the inner product as follows:
(3) var(y[t]) = < Σ (ρ^j) * ε[t-j], Σ (ρ^j) * ε[t-j] >
= Σ (ρ^(2j)) < ε[t-j], ε[t-j] >   (the cross terms vanish because the innovations are independent with mean 0)
= σ[ε]^2 * Σ (ρ^(2j))
= σ[ε]^2 / (1 - ρ^2)   (geometric series, which converges for |ρ| < 1)
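As a quick numerical check of the closed form in (3), here is a minimal Python sketch. The function names are mine, not Percival’s, and I am assuming unit innovation variance (σ[ε] = 1), as in the Figure 2 caption. It compares the closed form with a long truncation of the infinite sum.

```python
import math


def ar1_variance(rho, sigma_eps=1.0):
    """Closed-form stationary variance of y[t] = rho*y[t-1] + eps[t]."""
    if abs(rho) >= 1:
        raise ValueError("a stationary variance requires |rho| < 1")
    return sigma_eps ** 2 / (1.0 - rho ** 2)


def ar1_variance_partial_sum(rho, sigma_eps=1.0, terms=20000):
    """Truncated version of the sum in (3): sigma_eps^2 * sum of rho^(2j)."""
    return sigma_eps ** 2 * sum(rho ** (2 * j) for j in range(terms))


var_closed = ar1_variance(0.997)
var_series = ar1_variance_partial_sum(0.997)

print(round(var_closed, 1))             # 166.9
print(round(math.sqrt(var_closed), 1))  # 12.9
print(abs(var_closed - var_series) < 1e-6)  # True
```

The two printed values match the 166.9 and 12.9 quoted from Percival’s caption.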

Percival illustrates the very serious finite-sample problems of variance estimation in autoregressive processes when you don’t know the true mean, as follows, in connection with his Figure 2. Here s is the estimate of the standard deviation and σ is the true standard deviation of the process (the notation is slightly different for typographical reasons):

For sample size N = 10, we have E{s^2/σ^2} = 0.01; i.e., the sample variance is biased by a factor of 100. The two thin vertical lines enclose one such sample of 10 points, for which s^2 = 0.7 (well below σ^2 = 166.9). If we sweep across the time series plotted above and compute s^2 for all possible 991 subseries of size 10, we find that s^2 varies from 0.1 to 17.1 and has an average value of 1.8 (again well below the true variance). On the other hand, if we compute s^2 based upon the known process mean of zero, we find that s^2 varies from 0.6 to 764.0 and has an average value of 184.1, which is rather close to σ^2 = 166.9.

Percival [1993] Figure 2. The jagged curve shows a portion of size 1000 of a realization from the stationary zero mean AR(1) process X[t] = 0.997X[t−1] + ε[t], where ε[t] is a white noise process with zero mean and unit variance. The variance of X[t] is σ^2 = 166.9, and the two thin horizontal lines mark ±σ = ±12.9.
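The sweep Percival describes is easy to imitate. The sketch below is my own and is not his code: it simulates 1000 points of the AR(1) process with Gaussian innovations (an assumption; the caption only says white noise), then computes s^2 over all 991 windows of 10 points, once around the window mean and once around the known mean of zero. I am assuming a divisor of N in both estimators. The numbers will differ from Percival’s because the realization is different, but the contrast between the two estimators should be qualitatively the same.

```python
import random


def simulate_ar1(n, rho, seed=0, sigma_eps=1.0):
    """Simulate n points of y[t] = rho*y[t-1] + eps[t] with Gaussian innovations."""
    rng = random.Random(seed)
    # Draw the first value from the stationary distribution so the path is stationary.
    y = rng.gauss(0.0, sigma_eps / (1.0 - rho ** 2) ** 0.5)
    path = []
    for _ in range(n):
        path.append(y)
        y = rho * y + rng.gauss(0.0, sigma_eps)
    return path


def window_variances(path, width):
    """s^2 for every window of the given width, with unknown and with known (zero) mean."""
    unknown_mean, known_mean = [], []
    for start in range(len(path) - width + 1):
        w = path[start:start + width]
        m = sum(w) / width
        unknown_mean.append(sum((x - m) ** 2 for x in w) / width)  # mean estimated
        known_mean.append(sum(x * x for x in w) / width)           # mean known to be 0
    return unknown_mean, known_mean


path = simulate_ar1(1000, 0.997, seed=0)
s2_unknown, s2_known = window_variances(path, 10)

avg_unknown = sum(s2_unknown) / len(s2_unknown)
avg_known = sum(s2_known) / len(s2_known)

print(len(s2_unknown))  # 991
print("unknown mean: min, avg, max =", min(s2_unknown), avg_unknown, max(s2_unknown))
print("known mean:   min, avg, max =", min(s2_known), avg_known, max(s2_known))
```

Within each window the unknown-mean estimate can never exceed the known-mean one, since subtracting the window mean removes N times its square from the sum of squares. With ρ this close to 1, the window mean absorbs nearly all of the process variance.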

Percival reports a short theorem stating that the situation can get arbitrarily bad, as follows:

For every sample size N ≥ 1 and every ε > 0, there exists a stationary process such that E{s^2/σ^2} < ε.
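One way to see how the ratio can be driven toward zero is to compute it exactly for AR(1) processes. The sketch below uses the standard identity E{s^2} = σ^2 − var(sample mean), with var(sample mean) = (σ^2/N) * [1 + 2 Σ (1 − k/N) ρ^k] for k = 1:(N−1). I am assuming s^2 uses a divisor of N and the estimated mean. The function name is mine, and the AR(1) family is only one illustration, not necessarily the construction Percival uses in his proof.

```python
def expected_variance_ratio(n, rho):
    """Exact E{s^2}/sigma^2 for an AR(1) process, s^2 using the sample mean and divisor n."""
    # Weighted sum of the AR(1) autocorrelations rho^k at lags 1..n-1.
    weighted = sum((1.0 - k / n) * rho ** k for k in range(1, n))
    # Variance of the sample mean, expressed as a fraction of sigma^2.
    var_of_mean_ratio = (1.0 + 2.0 * weighted) / n
    return 1.0 - var_of_mean_ratio


for rho in (0.0, 0.9, 0.997, 0.9999):
    print(rho, expected_variance_ratio(10, rho))
```

For N = 10 and ρ = 0.997 this gives roughly 0.01, in line with the factor of 100 quoted above. For ρ = 0 it gives the familiar (N − 1)/N = 0.9. As ρ approaches 1 with N held fixed, the ratio tends to 0.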

I’m going to come back to these issues when I discuss why “scaling” autoregressive processes in a short calibration period (as recently argued by Hockey Team members Briffa and Esper) does not avoid the need to examine statistical issues and is therefore not a magic bullet. You can also picture why some care is needed if you are standardizing by dividing by the calibration-period standard deviation, or even calculating correlations (which include standard deviation terms). Simply transposing techniques from independent samples can be a dangerous recipe.

D. B. Percival (1993), “Three Curious Properties of the Sample Variance and Autocovariance for Stationary Processes with Unknown Mean,” The American Statistician, 47, no. 4, pp. 274-6. http://faculty.washington.edu/dbp/PDFFILES/three-curiosities.pdf


  1. Armand MacMurray
    Posted Sep 5, 2005 at 1:13 PM | Permalink

    Typo: The estimator symbols, s^’0 and s^0, got left out of your first Percival quote.

  2. Louis Hissink
    Posted Sep 5, 2005 at 3:12 PM | Permalink

    I always become suspicious when complex statistical techniques are needed to extract “facts” from data that on first examination are rather uninformative – almost as if a theory was proposed and the data then massaged to support the theory.

    As for your initial question, I just wonder whether these issues arise because statistics are being extended to problems that can’t be subject to this type of analysis in the first place.

    Your forthcoming explanation should be interesting.

  3. TCO
    Posted Sep 5, 2005 at 5:25 PM | Permalink

    damn, just read your bio. You are a stud. First in class in pure maths. Squash world champion. Chavez coup experiencer. Sheesh.
