I fail to see how my question/request wasn’t “serious,” relevant, and important and wouldn’t have provided “something useful” for people here to learn.

I thought you were presenting an exercise in linear regression for the “cheerleaders” here. It would be shameful to ignore something like instrumental error in such a case. After all, it’s so improper to present the value of a trendline without including its error, even during casual conversation. So why would it be acceptable to ignore any portion of the error in the statistical analysis?

Your work with Tuvalu is certainly not representative of all environmental measurement. And regardless of how frequently environmental variability is much larger than instrumental error, instrumental error still needs to be accounted for rather than simply disregarded. I thought we were talking about generating statistically significant trendlines from noisy data in a general sense, not specifically about Tuvalu, or necessarily about an instance where instrumental error was relatively insignificant. Maybe this isn’t very applicable to Tuvalu or even the satellite measurements, but I would say it’s quite important when you’re talking about something like tree-ring proxies from hundreds of years ago.

Thank you for granting me the permission to do your exercises. As I said in another post, I’ve done thousands of linear regressions in my lifetime and don’t need a refresher course. I simply wanted to see how you specifically treated instrumental error and have it demonstrated for the cheerleaders. I apologize for making such a demanding request. Clearly, your time is better served arguing over the application of terms like “after-market” to climate change and when one can properly use phrases such as “more advanced than a 2nd year or 3rd year essay” and “beyond undergraduate degree.”

My statement in posting #77 that “in general, environmental variability has a far greater effect on the final uncertainty than instrumental errors” is quite obviously true in many instances. For example, take a look at the sea level record for Tuvalu at:

http://www.antcrc.utas.edu.au/~johunter/tuvalu.pdf

Figure 5 shows that the monthly sea level has environmental variability with an amplitude of at least 10 cm and a correlation scale of years. However, the instrumental error for a modern tide gauge such as this is down at the 1 cm level, and its correlation scale is considerably less. The environmental variability is clearly dominant. But don’t take my word for it — do the numerical experiment by synthesising records that statistically look like the Tuvalu record — that’s what posting #66 was all about.
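A rough version of that numerical experiment can be sketched in Python. The AR(1) noise model and the parameters below are illustrative stand-ins, not the actual Tuvalu statistics or the posting #66 recipe; the point is only that adding 1 cm of white instrumental noise to ~10 cm of autocorrelated environmental variability barely changes the spread of the fitted trend.

```python
# Illustrative sketch (assumed parameters, not the Tuvalu statistics):
# compare the spread of regression trends with and without 1 cm of
# white instrumental noise on top of ~10 cm of AR(1) "environmental"
# variability with a roughly 1-year decorrelation scale.
import numpy as np

rng = np.random.default_rng(0)
n_months, n_runs = 300, 2000            # 25-year monthly record
t = np.arange(n_months)
phi = np.exp(-1.0 / 12.0)               # AR(1) coefficient, ~1-year scale
env_sd, inst_sd = 10.0, 1.0             # cm: environmental vs instrumental

def ar1(n, phi, sd, rng):
    # AR(1) series with stationary standard deviation `sd`
    e = rng.normal(0.0, sd * np.sqrt(1.0 - phi**2), n)
    x = np.empty(n)
    x[0] = rng.normal(0.0, sd)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + e[i]
    return x

def trend_spread(with_instrument):
    # standard deviation of fitted trends over many synthetic records
    slopes = []
    for _ in range(n_runs):
        y = ar1(n_months, phi, env_sd, rng)
        if with_instrument:
            y = y + rng.normal(0.0, inst_sd, n_months)
        slopes.append(np.polyfit(t, y, 1)[0])
    return float(np.std(slopes))

print(trend_spread(False), trend_spread(True))  # nearly identical spreads
```

With these (assumed) numbers the instrumental contribution to the trend variance is orders of magnitude below the environmental one, so the two printed spreads are statistically indistinguishable.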

I’m still waiting for the calculated response(s) from you. To just claim that it doesn’t matter much doesn’t hold much water without a statistical demonstration to justify it (interesting: dismissing a statistical analysis as part of a statistical demonstration). The +/-1.0 spec I suggested could be ignored, as it would seem almost impossible for such an instrument to produce readings limited to the -0.75 to +0.75 range over such a large number of readings. I suggest revising them to +/- 0.10, 0.25, and 0.50.

Maybe a better question would have been to ask what the maximum instrumental error could be for the trend to remain significantly different from zero at various confidence levels.

I hope to see something by the end of the week, although it really shouldn’t take much time at all for a veteran such as yourself. I think it would be a tremendous addition to your Lin Reg 101 (or 102, if you feel it is that advanced).

**Steve:** See http://www.climateaudit.org/?p=300 for ARMA coefficients. I’ll post up sd later. Remind me if I forget.

For those who have completed Linear Regression 101 (posting #66), here is the next assignment: 🙂

Do the recipe of posting #66 for a range of specified trends (e.g. -0.003, -0.002, -0.001, -0.0005, 0, 0.0005, 0.001, 0.002, 0.003) but with the other constants as defined in the recipe. For each of these specified trends produce plots of, say, 10 different time series (each of which is the result of a different set of random noise). Select plots randomly from this total of 90 plots and show them to observers, asking the question “what is the sign of the trend in this plot?”. From this, derive the observers’ success rate for different “signal to noise” ratios (i.e. different values of the magnitude of the prescribed trend). Now do the same experiment except, instead of using real observers, use linear regression to estimate the sign of the trend (i.e. use the results of step (4) of the recipe).

Which is more successful at detecting the correct sign of the trend — the visual observer or linear regression?

The purpose of this exercise is to illustrate how a linear regression is a much more sensitive way of determining a trend than visual inspection of a plot.
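The regression half of the exercise can be sketched as follows. Since the posting #66 recipe is not reproduced here, the AR(1) noise model and the constants below are illustrative assumptions, not the original ones; the sketch simply tallies how often the fitted slope has the correct sign for each specified trend.

```python
# Sketch of the linear-regression half of the exercise (assumed noise
# model and constants, standing in for the posting #66 recipe).
import numpy as np

rng = np.random.default_rng(1)
n, n_runs = 300, 500                   # 25 years of monthly values per record
t = np.arange(n)
phi, noise_sd = 0.9, 0.2               # assumed AR(1) noise parameters

def ar1(n, phi, sd, rng):
    # AR(1) series with stationary standard deviation `sd`
    e = rng.normal(0.0, sd * np.sqrt(1.0 - phi**2), n)
    x = np.empty(n)
    x[0] = rng.normal(0.0, sd)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + e[i]
    return x

def sign_success(trend):
    # fraction of synthetic records whose fitted slope has the right sign
    hits = 0
    for _ in range(n_runs):
        y = trend * t + ar1(n, phi, noise_sd, rng)
        slope = np.polyfit(t, y, 1)[0]
        hits += int(np.sign(slope) == np.sign(trend))
    return hits / n_runs

for trend in (-0.003, -0.001, -0.0005, 0.0005, 0.001, 0.003):
    print(f"trend {trend:+.4f}: sign recovered {sign_success(trend):.0%}")
```

The success rate rises steeply with the magnitude of the specified trend, and for the visual half of the exercise one would show the same plots to human observers and compare.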

I disagree. I have indicated (and hopefully clarified in #89) how to derive the statistical distribution of the trend from synthesised data which uses a prescribed trend and autocorrelated noise. From that distribution you can easily calculate the probability of getting a false positive. For example, taking the case described in #89 and #90, if the trend is really exactly zero, then the probability of deriving a trend greater than 0.001 (one possible definition of a “false positive”) is about 2 percent.
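That 2 percent figure follows directly from a normal distribution of the regression trend with mean zero and the standard deviation of 0.0005 quoted in #89 and #90; a one-line check:

```python
# P(trend > 0.001 | true trend = 0) for a normally distributed
# regression trend with mean 0 and sd 0.0005 (the #89/#90 values).
from math import erfc, sqrt

sd = 0.0005
threshold = 0.001
p = 0.5 * erfc(threshold / (sd * sqrt(2.0)))  # upper-tail normal probability
print(f"{p:.1%}")   # → 2.3%
```

This is the 2.3% theoretical value shown in brackets in the tables below; the "about 2 percent" quoted above is the corresponding Monte Carlo estimate.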

It is easy to make linear regression sound overcomplicated and I think much of your article at the beginning of this thread does just that. A regression trend, in its simplest form, is just a weighted average of the input data (having first removed the simple average from the input data), where the weights are just a linear trend (negative at the start of the record and positive at the end). So if you know the statistics of the input data, it is just “first year” statistics to derive the statistics of the trend.

So if you think the statistics of an average are simple, then so too are the statistics of a regression trend. If, on the other hand, you think the statistics of an average are complicated, then so too are the statistics of a regression trend.
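That “weighted average” description is easy to verify numerically: with weights w = t - mean(t) (negative at the start of the record, positive at the end), the weighted-average form reproduces the ordinary least-squares slope. A minimal sketch on arbitrary synthetic data:

```python
# Check that the regression slope is a weighted average of the data,
# with linear weights w = t - mean(t).
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(100, dtype=float)
y = 0.5 * t + rng.normal(0.0, 3.0, t.size)        # arbitrary noisy series

w = t - t.mean()                                  # negative early, positive late
slope_weighted = np.sum(w * y) / np.sum(w * t)    # "weighted average" form
slope_lstsq = np.polyfit(t, y, 1)[0]              # ordinary least squares
print(np.isclose(slope_weighted, slope_lstsq))    # → True
```

The two agree identically because sum(w * t) equals sum(w * w), so the weighted-average form is algebraically the same as the least-squares formula.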

As regards the quote from Wunsch, you should note that he refers to data that “appears visually interesting” — the pitfalls inherent in making VISUAL inferences from time series are what my series of postings is really all about — see a following posting on this.

The “Comment Submitter” seems to make a stuff-up when I input a table which includes percentages (or perhaps “less than” or “greater than” symbols), so here is the table at the end of posting #89 again:

less than -0.001: 2.7 percent (2.3 percent)

less than -0.0005: 17.5 percent (15.9 percent)

greater than 0.0005: 14.5 percent (15.9 percent)

greater than 0.001: 2.0 percent (2.3 percent)

For the first experiment, when you applied a trend of 0.001, linear regression returned trends distributed as 0.001 ± 0.0005 (standard deviation) — i.e. the same result that I found.

If you had, instead, applied a zero trend to the “first” experiment, you would have found that linear regression returned trends distributed as 0.000 ± 0.0005 (standard deviation) — i.e. the spread of the trends would have been the same as the spread for the original “first” experiment. This is an important result and perhaps one that I should have indicated earlier — that the spread of the regression trend depends on the variance and temporal scale of the noise and not on the specified trend.
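This claim is easy to check by simulation. The AR(1) noise parameters below are illustrative, not those of the original recipe; the point is that the spread of the fitted trends comes out the same whether the specified trend is zero or not.

```python
# Illustrative check (assumed AR(1) noise parameters): the spread of
# regression trends is set by the noise, not by the specified trend.
import numpy as np

rng = np.random.default_rng(3)
n, n_runs, phi, noise_sd = 300, 1000, 0.9, 0.2
t = np.arange(n)

def ar1(n, phi, sd, rng):
    # AR(1) series with stationary standard deviation `sd`
    e = rng.normal(0.0, sd * np.sqrt(1.0 - phi**2), n)
    x = np.empty(n)
    x[0] = rng.normal(0.0, sd)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + e[i]
    return x

def spread(trend):
    # sd of fitted slopes over many records with the given specified trend
    slopes = [np.polyfit(t, trend * t + ar1(n, phi, noise_sd, rng), 1)[0]
              for _ in range(n_runs)]
    return float(np.std(slopes))

print(spread(0.0), spread(0.001))   # the two spreads agree closely
```

The specified trend only shifts the centre of the distribution of fitted slopes; its width is unchanged.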

Now, as I indicated in #84, your “second” experiment is not really different from your “first” experiment, except that the specified trend was different. Therefore the spread of trends should be the same (although I note that you say “I calculated a number (twenty or so) of unbiased runs extended out to ….. 250 years of running averages” — which means that some of your records must overlap and hence are not independent). I generated 1000 realisations of 25 year records with zero specified trend, and found the following distribution of regression trends (the figures in brackets are the theoretical values for a standard deviation of 0.0005 and a normal distribution):

&lt; -0.001: 2.7% (2.3%)

&lt; -0.0005: 17.5% (15.9%)

> 0.0005: 14.5% (15.9%)

> 0.001: 2.0% (2.3%)

which agree well with the theoretical values.

If you didn’t get results similar to the above, then I suspect you did something wrong.
