I have difficulty seeing how what Mann and co. do with some of these papers can be called ‘scientific’…..

Vague undefined terms like ‘moderate’ and ‘low-weight’ while the exact recipe is kept deliberately opaque…. And the details turn out to be highly disputable if not risible. This all brings to mind alchemy and astrology, not rigorous science….

Wow, I realize that top chefs can whip up delectable culinary creations without exact numerical recipes, relying a lot upon taste, experience, and intuition…… but science is not cuisine (which is not to say there are no scientific aspects underlying cuisine, of course).

The output of Mann et al. too often seems based upon hunches and guesses dressed up in scientific and statistical trimmings. It’s one thing to investigate based upon hunches, hypotheses, and even speculation…. but quite another to present one’s musings as reliable, well-tested “science” (relying in this case on what Bradley had referred to as a “crap” proxy series).

The regression F statistic for a simple regression with n observations is (n-1)*R2/(1-R2), and if there is no serial correlation and the errors are normal etc, this has an F(1,n-1) distribution

Correction — in a simple regression (q = 1),

F = (n-2)*R2/(1-R2)

and has an F(1,n-2) distribution, assuming normality, no serial correlation, etc.
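The corrected identity — and the equivalence with the slope's t test mentioned below — can be checked numerically. A minimal sketch on synthetic data (the variables and sample size here are illustrative, with no connection to any proxy series):

```python
import numpy as np

# Verify that in a simple regression (intercept + one slope),
# F = (n-2)*R2/(1-R2) equals the square of the slope's t-statistic.
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

# OLS fit with intercept
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
ss_res = np.sum((y - yhat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# Regression F statistic with (1, n-2) degrees of freedom
F = (n - 2) * r2 / (1 - r2)

# t-statistic on the slope
sigma2 = ss_res / (n - 2)
var_beta = sigma2 * np.linalg.inv(X.T @ X)
t_slope = beta[1] / np.sqrt(var_beta[1, 1])

print(F, t_slope ** 2)  # the two agree exactly
```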

Every time I read one of these threads I think back to Mann’s own description of his papers as being ‘Skilful’ and cannot stop laughing.

Yang’s series had already been calibrated to temperature by Yang, and hence there was no reason for MJ03 to re-calibrate it.

Yes, and additionally they claim in the caption that the composite consists of eight proxies. Notice also that Mann does not say in his reply how many proxies were used, but he did attach Fig. 1 from the MJ03 paper (which of course has the 8 NH proxies marked). Attaching the figure for Bradley seems to indicate that Bradley had not seen the MJ03 draft.

If I were told that 1 of 8 series got “low moderate weight”, I would not expect anything more than 10% weight.

The regression F statistic for a simple regression with n observations is (n-1)*R2/(1-R2), and if there is no serial correlation and the errors are normal etc, this has an F(1,n-1) distribution, so that the significance depends a lot on the sample size. (In a simple regression, the regression F statistic is just the square of the t-stat on the slope, so the two tests are equivalent.)

In some of this literature (e.g. Thompson CC03, PNAS06), series are inefficiently aggregated with equal weight after scaling, and so this test is relevant for the significance of the aggregate’s correlation with temperature.

However, in MJ03, the series are more efficiently aggregated according to their correlation with temperature. This is like multiple regression, in the special case where the regressors (proxies) are uncorrelated.

While this is more efficient, it must be remembered that the R2 of the correlation of temperature with the “predicted temperature” computed from a multiple regression is exactly the same as the R2 of the multiple regression. The significance of the simple correlation between temperature and the compound series therefore must be based on the underlying *multiple* regression F statistic. When there are q regressors in addition to the constant term, this is F = (n-q-1)*R2/(q*(1-R2)), and has (q, n-q-1) DOF. If R2 is instead treated as if it arose directly from a simple regression, there will be a strong “data mining” or “wheelbarrow” effect that tends to make the correlation seem more significant than it really is.
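The size of this “wheelbarrow” effect is easy to see in simulation. A sketch, assuming a made-up setup (pure-noise target and proxies, n and q chosen only for illustration): regress the target on q noise regressors, then score the resulting R2 both with the correct multiple-regression F and with the naive simple-regression F.

```python
import numpy as np
from scipy import stats

# Regress a pure-noise target on q pure-noise "proxies" and compare
# the correct multiple-regression F test with the naive simple one.
rng = np.random.default_rng(1)
n, q = 50, 8
y = rng.normal(size=n)         # "temperature": pure noise
P = rng.normal(size=(n, q))    # q noise "proxies"

X = np.column_stack([np.ones(n), P])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
r2 = 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

# Correct test: multiple-regression F with (q, n-q-1) DOF
F_multi = (n - q - 1) * r2 / (q * (1 - r2))
p_multi = stats.f.sf(F_multi, q, n - q - 1)

# Naive test: pretend the same R2 came from a simple regression
F_naive = (n - 2) * r2 / (1 - r2)
p_naive = stats.f.sf(F_naive, 1, n - 2)

print(p_multi, p_naive)  # the naive p-value is smaller
```

Note that the two F statistics differ by the fixed factor q*(n-2)/(n-q-1), which is what drives the overstated significance.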

A similar “data mining” effect arises if insignificant series are simply dropped from the predictive equation after pre-screening (i.e. given a weight of 0). This isn’t necessarily wrong, and may even be reasonable, so long as the omitted series are counted in the “q” of the final regression F statistic. But hardly anyone ever does this!
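The pre-screening version of the effect can also be sketched numerically. The setup below is hypothetical (noise target, 20 candidate noise series), and the Šidák-style adjustment at the end is my own stand-in for counting the screened-out series, offered only to show the direction and rough size of the bias:

```python
import numpy as np
from scipy import stats

# Screen many noise "proxies" against a noise target, keep the best
# one, and test it as if it were the only candidate ever examined.
rng = np.random.default_rng(2)
n, candidates = 50, 20
y = rng.normal(size=n)
P = rng.normal(size=(n, candidates))

# correlation of each candidate with the target; keep the best
r = np.array([np.corrcoef(y, P[:, j])[0, 1] for j in range(candidates)])
best = np.argmax(np.abs(r))
r2 = r[best] ** 2

# Naive p-value: ignores the 19 discarded series
F = (n - 2) * r2 / (1 - r2)
p_naive = stats.f.sf(F, 1, n - 2)

# Rough adjustment for screening 20 series (Sidak-style; approximate,
# since the screens share y and are not truly independent)
p_adjusted = 1 - (1 - p_naive) ** candidates

print(p_naive, p_adjusted)
```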

I must confess that it took me many years of teaching basic econometrics to figure this out — I at first thought that the “wheelbarrow” bias was due to the “-q” in the numerator, but this is minor in comparison to the “q” in the denominator!

(Since temperature is the exogenous variable and the proxies are dependent on it, it would in fact be appropriate to regress the “predicted temperature” series from the multiple regression of temperature on the proxies on temperature, and then invert the regression line as in Classical Calibration Estimation. However, even though this beefs up the slope, the simple correlation coefficient, and hence the regression F statistic, is the same either way, and so the significance of the slope is the same either way, abstracting from serial correlation.)
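The "beefed up slope but unchanged correlation" point can be checked in the simple one-proxy case. A sketch on synthetic data (the 0.8 response coefficient and sample size are arbitrary assumptions):

```python
import numpy as np

# Compare the direct (ICE) slope with the inverted-calibration (CCE)
# slope; the CCE slope is steeper by the factor 1/r^2, but r itself,
# and hence the F test, is identical either way.
rng = np.random.default_rng(3)
n = 50
temp = rng.normal(size=n)
proxy = 0.8 * temp + rng.normal(size=n)   # proxy responds to temperature

r = np.corrcoef(temp, proxy)[0, 1]

# ICE: regress temperature directly on the proxy
slope_ice = r * temp.std() / proxy.std()

# CCE: regress the proxy on temperature, then invert the fitted line
b = r * proxy.std() / temp.std()
slope_cce = 1.0 / b

print(slope_ice, slope_cce)  # CCE slope larger in magnitude by 1/r^2
```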

I concur with Richard that by MJ03’s logic, they “should” have adjusted the rather subjective “dof” that went into their weights when they went from 8 series to 6, since deleting Mongolia would somewhat increase the area represented by Yang and deleting W Greenland would somewhat increase the area attributable to GRIP and Dye3. Yang already represents 9 sites scattered across China, whence its big “dof”.

But Jean’s point is not that they weighted correctly or incorrectly, but just that, given how they did weight, they ended up with 30% on Yang despite Bradley’s misgivings about that series. 30% for 1 of 6 series can hardly be called “moderate”.
