
I guess my question at this point is why argue the more sophisticated statistical analytical points when the lower level stuff says stop?

If the conventional statistical analysis gives an RE of -18, you have to invent something better (even though RE is the HT's accuracy measure, not a conventional one, in my opinion).

the use of CVM to average out white and other noises

CVM does not average out the noise; it scales the noise so it looks like signal. Sorry, I'm being provocative and maybe even cynical now.
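A toy calculation illustrates the scaling point. Everything here is synthetic, and "CVM" is sketched in its simplest composite-plus-variance-matching form (average the proxies, then rescale the composite to the target variance); the numbers are illustrative, not anyone's actual reconstruction:

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_proxies, r = 150, 5, 0.45

signal = np.sin(np.linspace(0, 6 * np.pi, n))      # stand-in "temperature"
signal_part = r * signal                            # common signal in each proxy
noise_parts = rng.normal(scale=np.sqrt(1 - r**2), size=(n_proxies, n))
proxies = signal_part + noise_parts

# Composite-plus-variance-matching: average the proxies, then rescale the
# composite so its variance matches the target series.
composite = proxies.mean(axis=0)
scale = signal.std() / composite.std()

# Averaging only 5 proxies leaves plenty of noise in the composite; the
# variance-matching step then blows that residual noise back up.
noise_in_composite = noise_parts.mean(axis=0)
print(f"noise std in composite before scaling: {noise_in_composite.std():.2f}")
print(f"noise std after variance matching:     {(scale * noise_in_composite).std():.2f}")
print(f"target signal std:                     {signal.std():.2f}")
```

With few proxies the composite's variance falls short of the target's, so the matching step multiplies the whole composite, residual noise included, by a factor greater than one.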

First and foremost, and obviously from a less sophisticated understanding of time series analysis, what I see are correlations reported for many of these reconstructions from the calibration period of around 0.45. Without further analysis and adjustment of this number, we are looking at regressions where temperature explains approximately 20% of the proxy's measured response, which means approximately 80% of the variation in response is unexplained. Now it is a simple matter for me to understand that if this other 80% is like unvarying white noise and continues that way back through the entire proxy estimation period, we can obtain a reasonable estimation of historical temperatures, provided further that the response of the proxy remains the same over the entire range of real reconstruction temperatures. One need be more concerned, I would imagine, the greater the amount of unexplained proxy response that comes out of the calibration process, and 80% seems like a lot to me.
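The arithmetic behind that 20%/80% split is just the square of the correlation; a quick check, with a synthetic proxy built to have roughly that correlation (all numbers illustrative):

```python
import numpy as np

r = 0.45
print(f"explained variance r^2 = {r**2:.2f}")      # ~0.20, i.e. ~20% explained
print(f"unexplained fraction   = {1 - r**2:.2f}")  # ~0.80 unexplained

# A synthetic proxy constructed to correlate ~0.45 with "temperature":
rng = np.random.default_rng(0)
temperature = rng.normal(size=500)
proxy = r * temperature + np.sqrt(1 - r**2) * rng.normal(size=500)
r_sample = np.corrcoef(temperature, proxy)[0, 1]
print(f"sample correlation of synthetic proxy: {r_sample:.2f}")
```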

The next step is that of verification, which to my understanding is important, if for no other reason than the one Steve M often mentions: it indicates whether the selection of proxies for the calibration was overfit, i.e. the regression will perform well in the period from which it was selected, but not so well in other periods where no prior fitting was done. When these regressions show significantly lower R^2 values in the verification period than in the calibration period, as apparently they do in many cases in these reconstructions, that in my view is a show stopper right there. I guess my question at this point is why argue the more sophisticated statistical analytical points when the lower level stuff says stop?
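The calibration/verification gap is easy to reproduce numerically: screen pure-noise "proxies" by their calibration-period correlation, and the calibration R^2 looks respectable while the verification R^2 collapses. This is a sketch with made-up sizes, not anyone's actual procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
n_cal, n_ver, n_proxies = 50, 50, 200

temp_cal = rng.normal(size=n_cal)
temp_ver = rng.normal(size=n_ver)

# The "proxies" are pure noise: no temperature signal at all.
proxies_cal = rng.normal(size=(n_proxies, n_cal))
proxies_ver = rng.normal(size=(n_proxies, n_ver))

# Screen by calibration-period correlation, keeping the 10 best.
r_cal = np.array([np.corrcoef(p, temp_cal)[0, 1] for p in proxies_cal])
selected = np.argsort(-np.abs(r_cal))[:10]
mean_r2_cal = np.mean(r_cal[selected] ** 2)

# The same proxies against the verification period, where no selection helped.
r_ver = np.array([np.corrcoef(proxies_ver[i], temp_ver)[0, 1] for i in selected])
mean_r2_ver = np.mean(r_ver ** 2)

print(f"calibration R^2 of screened proxies: {mean_r2_cal:.2f}")
print(f"verification R^2 of same proxies:    {mean_r2_ver:.2f}")
```

The selection step guarantees an inflated calibration fit even when there is nothing to find, which is exactly why an honest verification statistic is the show stopper.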

The Hockey Team puts forth, in my view anyway, rather esoteric rationalizations and much effort at this point: references to lower-frequency responses from the proxies that the regression R^2 misses; the use of CVM to average out white and other noises; the use of unexplained, much less unsubstantiated, teleconnections to extend temperatures away from local responses towards regional ones; and even computer calculations with pseudo-proxies to justify results. None of this overcomes the original assumption that the unexplained noises must cancel or average out, or at least tend to, over numerous proxies.

Would not a less hurried Hockey Team concentrate its efforts on a better understanding of all that noise in the signal and on a basic biological/physical understanding of the proxy responses to temperature and other potentially involved variables? Add to my puzzlement the residual graphs that Steve M recently published at CA, showing a less than random, even recognizable, pattern in the residuals (noise) between the instrumental and proxy temperature responses in the Union reconstructions, and I have to ask whether I am missing some more subtle aspect of these analyses (given my less than sophisticated comprehension of them) that might not change my view of the results, but would at least explain more of the direction the Hockey Team has taken.

Also, I must ask whether econometric analyses, both good and bad, could be presented here to show more acceptable statistical analyses of time series (in comparison to the HT's reconstructions) and to critique the less than optimum approaches.

I am glad that this discussion has moved towards demanding an explicit description and justification of the HT's data model. The HT's opaque statistical manipulations are certainly suspicious, but I think the discussion should focus on making them justify that what they are doing recovers the true signal. Scientists should have an obligation to do nothing less.

It’s not really about leptokurtosis.

Why does everyone call it leptokurtosis ? Lepto usually means small – you would have thought high values would be called baryokurtosis or some such.

Love the Keynes quote.

In my field (feedback control systems) my colleagues who have used Kalman filters say they can work well when you have a good (linear) model of the system, even in the face of changes in the system and noise parameters (the whole point of KF), but if you don’t have a decent model, it quickly goes all to hell, and produces worse results than much simpler techniques. Caveat emptor!

The Kalman filter works well when its assumptions are satisfied (applied optimal estimation is something the HT should look into, BTW). But in the climate case, with only 1000 years or so of data, it is easier to compute everything in batch mode; I don't think recursive solutions are needed. Beyond the noise model problem (..HT can't admit that proxy noise variance is close to ..), another problem is the dynamic model for the global temperature process. Underestimation of autocorrelation leads to underestimation of temperature variability, as shown here:

http://www.geocities.com/uc_edit/ar1/estimation.html
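The downward bias in short-sample autocorrelation estimates that the linked page discusses can be reproduced in a few lines (a sketch, not the page's own code; the AR(1) parameter and sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
phi, n, trials = 0.9, 50, 2000

estimates = []
for _ in range(trials):
    # Simulate a short AR(1) series with true coefficient phi.
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    # Lag-1 sample autocorrelation as the AR(1) estimate.
    xc = x - x.mean()
    estimates.append(np.sum(xc[1:] * xc[:-1]) / np.sum(xc**2))

print(f"true phi: {phi}, mean estimate over {trials} trials: {np.mean(estimates):.2f}")
```

The mean estimate comes out well below the true value, and an AR(1) model fitted with that deflated coefficient generates series with too little low-frequency variability.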

I originally thought this was the problem of Mann’s reconstructions. But here I have learned that it is not the dynamic model, but general overfitting using multivariate calibration and additional scaling steps.

Evidently, it was well known by the time they got the prize that this assumption was completely invalid, that the distributions had far “fatter” tails than Gaussian, and that this is what led to the downfall of LTCM.

Fat-tailed distributions are very challenging indeed. The estimator can be BLUE and the designer can still need some additional outlier-search procedure.

With respect to LTCM: Not exactly.

It’s not really about leptokurtosis. It’s about making bets bigger than you can afford to carry until they pay off. The Wikipedia article gives a nice summary with the key quote “The market can stay irrational longer than you can stay solvent”.

In my field (feedback control systems) my colleagues who have used Kalman filters say they can work well when you have a good (linear) model of the system, even in the face of changes in the system and noise parameters (the whole point of KF), but if you don’t have a decent model, it quickly goes all to hell, and produces worse results than much simpler techniques. Caveat emptor!

When I was in grad school, virtually every lecture I got on techniques such as this started with the sentence, “First, we assume a Gaussian noise distribution…” Then after an hour of math, they never got back to examining the validity of the assumption. In the real world, I have virtually never run into noise distributions in my field that were remotely Gaussian.

I read a little while back that the Black-Scholes economic model, which won these guys the Nobel Memorial Prize in Economics and was the basis for the techniques of the Long Term Capital Management hedge fund, was based entirely on assumptions of a Gaussian distribution for the magnitude of economic “disturbances”. Evidently, it was well known by the time they got the prize that this assumption was completely invalid, that the distributions had far “fatter” tails than Gaussian, and that this is what led to the downfall of LTCM.
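The difference fat tails make is easy to see numerically. Comparing 5-sigma exceedance rates for a Gaussian and a variance-matched Student-t with 3 degrees of freedom illustrates the general point (this is an illustration of tail behaviour, not a model of LTCM's book):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000

gauss = rng.normal(size=n)
fat = rng.standard_t(df=3, size=n)
fat /= fat.std()                      # rescale to match the Gaussian's variance

# Empirical probability of a "5-sigma" event under each distribution.
p_gauss = np.mean(np.abs(gauss) > 5)
p_fat = np.mean(np.abs(fat) > 5)
print(f"Gaussian     P(|x| > 5 sigma) ~ {p_gauss:.5f}")
print(f"Student-t(3) P(|x| > 5 sigma) ~ {p_fat:.5f}")
```

Under the Gaussian a 5-sigma event is essentially never observed in a sample this size, while the fat-tailed distribution with the very same variance produces them routinely, which is exactly the regime where Gaussian-based risk estimates fail.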

I will try to have a look at how Kalman filter methodology can be applied here.

You will still need a measurement noise model (especially the R matrix). And we are again back at #167. I'm afraid there are no shortcuts.
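To illustrate why the R matrix matters, here is a one-dimensional sketch (all parameters invented): the same scalar Kalman filter run once with the true measurement-noise variance and once with an assumed value 100 times too small.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
phi, q, r_true = 0.8, 0.1, 1.0   # AR(1) state model; true measurement-noise variance

# Simulate the state and noisy measurements of it.
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal(scale=np.sqrt(q))
y = x + rng.normal(scale=np.sqrt(r_true), size=n)

def kalman(y, phi, q, r):
    """Scalar Kalman filter with assumed measurement-noise variance r."""
    xhat, p, out = 0.0, 1.0, []
    for obs in y:
        xhat, p = phi * xhat, phi**2 * p + q             # predict
        k = p / (p + r)                                   # gain
        xhat, p = xhat + k * (obs - xhat), (1 - k) * p    # update
        out.append(xhat)
    return np.array(out)

rmse_good = np.sqrt(np.mean((kalman(y, phi, q, r_true) - x) ** 2))
rmse_bad = np.sqrt(np.mean((kalman(y, phi, q, r_true / 100) - x) ** 2))
print(f"correct R:           RMSE = {rmse_good:.2f}")
print(f"R too small by 100x: RMSE = {rmse_bad:.2f}")
```

With R set far too small the gain goes to one and the filter just tracks the noisy measurements, so the "filtered" estimate is barely better than the raw data; getting the noise model right is most of the work.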

Ok, I think this is a reasonable suggestion. I will try to have a look at how Kalman filter methodology can be applied here. There are not many models of tree-ring growth that I am aware of. In that paper they reach the conclusion that the model produces growth series that are correlated with the true series, with correlations no higher than those obtained directly from local temperature, so some progress is still needed.

The pseudo-proxies do not necessarily assume white noise. You can put in whatever noise you think is reasonable. So far I have tried with red noise and with long-term persistence (LTP, Hurst series) noise, assuming different levels of noise contamination. In all cases the CVM method is the best of the three, and it is quite robust to all these different parameters. However, it is indeed not perfect.
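Generating pseudo-proxies with different noise colours is straightforward; a minimal sketch with white and AR(1) red noise at a fixed contamination level (the "truth" series, sizes, and parameters are invented, and the composite here is a plain mean, not the full CVM procedure):

```python
import numpy as np

rng = np.random.default_rng(5)
n, n_proxies, snr = 300, 20, 0.25    # 25% of each proxy's variance is signal

truth = np.cumsum(rng.normal(size=n))           # stand-in "true temperature"
truth = (truth - truth.mean()) / truth.std()

def red_noise(phi, n, rng):
    """Unit-variance AR(1) noise; phi = 0 gives white noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x / x.std()

results = {}
for label, phi in (("white", 0.0), ("red, phi=0.7", 0.7)):
    proxies = np.array([np.sqrt(snr) * truth + np.sqrt(1 - snr) * red_noise(phi, n, rng)
                        for _ in range(n_proxies)])
    composite = proxies.mean(axis=0)
    results[label] = np.corrcoef(composite, truth)[0, 1]
    print(f"{label:14s} composite-vs-truth r = {results[label]:.2f}")
```

Swapping in an LTP/Hurst generator for `red_noise` would test the long-memory case the same way; the point is only that the noise model is a free choice in a pseudo-proxy experiment, not something the framework forces to be white.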

I have already commented somewhere else here about alarmist language, and others have also published on this in other media.
