This is a very pretty example, though the problem is endemic, as Mann et al 2008 uses Mannomatic methods for industrial-strength voodoo correlations.
This one came up from trying to replicate Mannian confidence intervals from original data. The effort promptly foundered: my CPS emulation, which after much effort finally worked on the 1850-1995 period, failed immediately when I tried the “late-miss” (1850-1949) and “early-miss” (1896-1995) calibrations. So it was back to step-by-step reconciliation, obviously a lot easier now that a lot of modules are working. UC saved the Matlab intermediates, so we had something to work with.
Here’s what happened.
In making my “late-miss” emulation for AD1000, I got all the “right” proxies and the calculations all worked (with a little editing). But the orientation of the Socotra dO18 speleothem was inverted in the late-miss version. (This is the speleothem series that we looked at in some detail a month ago, but that’s just a coincidence.)
Huh?? Why would the orientation be right in one emulation and wrong in another emulation? The answer was very timely in terms of “voodoo correlations”.
The “low-frequency” correlation was 0.476 for the 1850-1995 period: a “significant” correlation. The “low-frequency” correlation for the subperiod 1850-1949 was also “significant”: but it was –0.580.
In my initial attempt to emulate Mann’s “late-miss” recon, I had oriented the series using the sign of the correlation for the 1850-1995 period. Silly me. It appears that Mann’s alignment is so opportunistic that the same series is used in different orientations depending on whether it is a “late miss” or “early miss” version. Industrial strength Mannomatic.
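The flip can be illustrated with a minimal sketch (in Python rather than Mann’s archived Matlab; the series, the sign rule as written here, and the function name are invented for illustration): a series built to correlate positively with temperature over the full 1850-1995 period but negatively over 1850-1949 gets opposite orientations from a sign-of-correlation rule.

```python
import numpy as np

def cps_sign(proxy, temp, mask):
    # Orientation rule sketched here: +1 if the proxy correlates
    # positively with instrumental temperature over the calibration
    # mask, -1 otherwise. Not Mann's archived code.
    r = np.corrcoef(proxy[mask], temp[mask])[0, 1]
    return 1.0 if r >= 0 else -1.0

# Toy series: negatively sloped against temperature before 1950,
# strongly positively sloped afterwards, so the full-period (1850-1995)
# correlation is positive while the "late-miss" (1850-1949) one is negative.
years = np.arange(1850, 1996)
temp = np.linspace(0.0, 1.0, years.size)
proxy = np.where(years < 1950, -0.3 * temp, 5.0 * temp - 3.655)

s_full = cps_sign(proxy, temp, np.ones(years.size, dtype=bool))
s_late = cps_sign(proxy, temp, years < 1950)
print(s_full, s_late)  # opposite signs: one series, two orientations
```

The same series thus enters the full-period emulation right-side up and the late-miss emulation upside down, which is exactly the reconciliation failure described above.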
Ooga in:
ooga=load(strcat('cps_results/',upper('NH'),'_ea100',expe,'_newgrid_',iseries,'_gbeachyear.mat'));
Chaka out.
54 Comments
Wow. Too bad that didn’t fit into 250 words.
Interesting. Just to clarify – you’re trying to reverse-engineer Mann’s GW chart and projections, because he won’t release the model himself?
Where is Julien Emile-Geay when you need him? Perhaps he can explain all these “sophisticated” Mannian methodologies.
#2. No. Mann has done a semi-decent job of archiving code. That’s how UC got the intermediates to see what was happening. But the code is horrendously written, with lots of pointless to-and-fro, so it takes time to put it into sensible terms and then to figure out what it all means statistically.
#1. Jeff, it’s worse than I thought. Out of 1209 proxies, 308 have opposite “low frequency” orientations between early-miss and late-miss. Presumably many are “significant” in both directions. What a mess. Yeah, I wish I’d noticed this earlier. But as you well know, it’s a laboratory of horrors and hard to know where to even begin.
Re: Steve McIntyre (#5)
Oh my.
Tell knew that there was no other path for Gessler to take – Mann simply diverted all paths to lead through the hollow way where he had chosen to camp out long ago.
Re: Steve McIntyre (#5), “Out of 1209 proxies, 308 have opposite “low frequency” orientations between early-miss and late-miss.”
Ouch! Even RC should have a difficult time defending that study. Having Tiljander used upside down was a great example of confirmation bias. But having over 300 series used with differing orientations goes way beyond “sloppy”. Whether it’s incompetence or laziness (or both), it’s difficult to say. It cannot be deliberate, because if it were, the researcher would know how profoundly inept that would make him look were it to be exposed. It is indeed too bad this wasn’t discovered before Steve’s comments to PNAS were accepted.
Re: MikeU (#40),
RC’s reaction should be interesting if they say anything. Since I can’t even comment there after asking them politely to rationalize removing data by correlation — RC gave a big SNIP to that question. I had no understanding of how over the top those guys were at the time or the question would never have been asked. It’s just data folks!
Maybe a post on some of the early RC comments and some from the paper contrasted with reality would be fun on The Air Vent. I’m still waiting patiently for the detail from Steve’s latest quantum multi-state anti-proxy discovery though.
Re: Steve McIntyre (#5),
Strictly speaking, the Socotra dO18 is the only proxy passing the significance test for all three periods while having an opposing sign between the “late miss” and “early miss” periods. But most of the proxies (tree ring, coral, etc.) are evaluated with a one-sided test, so this is not a surprising result.
Of the 484 proxies passing the 1850-1995 significance test, 342 also passed both sub-period tests (with 341 having r values with matching sign). 111 passed only one of the sub-period tests, and 31 failed both sub-periods.
Just wow.
The T rise during the last 150 years is some 0.3 – 0.4 percent in the absolute scale (which is the way T enters most physics equations). During that same time some other (especially local) factors that affect the “proxies” may have changed far more, due to human activity. Doesn’t that increase the chance of getting spurious correlations with T beyond that expected from pre-1850 noise?
Suppose some effect is sensitive to two factors:
Y = a * (T / T0) + b * (X / X0) + other_noise,
and T changes by 0.4 percent while X changes by 4 percent. If the changes in T and X are well correlated, and a isn’t much greater than b, I think one may get a good correlation with T due mostly to the b term.
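That argument checks out numerically; here is a sketch under the stated assumptions (a = b, a 0.4 percent trend in T against a correlated 4 percent trend in X; every number is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 150                              # ~150 "years"
T0, X0 = 288.0, 1.0                  # reference levels (illustrative)
a, b = 1.0, 1.0                      # comparable sensitivities

T = T0 * (1 + 0.004 * np.linspace(0, 1, n))   # T rises ~0.4 percent
X = X0 * (1 + 0.040 * np.linspace(0, 1, n))   # X rises ~4 percent, in step
Y = a * (T / T0) + b * (X / X0) + 0.005 * rng.standard_normal(n)

r = np.corrcoef(Y, T)[0, 1]
print(round(r, 2))   # a strong "temperature" correlation
```

The b term contributes ten times the trend of the a term here, yet Y still correlates strongly with T simply because X and T trend together.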
Just because the Team calls things “proxies” doesn’t mean that they are. You’re reading more into this than is there.
🙂
I tried to emulate Mannian flipping. What happens if the sign of the “low frequency” correlation changes but not the sign of the “high frequency” correlation? It all depends.
OK, I’ve pretty much emulated the late-miss AD1000 step. It’s really a bit pathetic how bad this study is.
Re: Steve McIntyre (#10),
And this is the study Gerry North wants you to “move on” from?
What good is it to have statisticians like North and Nychka involved in this stuff if they won’t “represent the ‘hood” when needed?
Sorry to vent, Steve. I know it’s inappropriate, but I’m really, really disappointed by the statisticians who should be improving climate science.
Re: Steve McIntyre (#10),
I would never have even guessed when this paper first came out. I can’t wait to see the writeup.
It’s the flapjack proxy, or a new form of quantum data which exists simultaneously in two states. No, I know, it’s [self snip] – jetlag.
Industrial strength voodoo correlations? Now that’s funny 🙂 lol
Anyway, I think I prefer this Ooga Chaka
Here’s an Ooga Chaka complete with garbage cans – GIGO ? This seems to be the most appropriate Ooga Chaka so far.
Re: Steve McIntyre (#12),
The first Ooga Chaka was hysterical; the trash-can one is even better.
Regarding the post, is it possible to have a second comment on Mann08? Or is that only possible if Mann attempts to rebut you?
#12 Looks like Al Gore with the blue bin.
Is there an explanation somewhere of what late-miss/early-miss corresponds to ?
But, poor old man(n), thou prun’st a rotten tree
That cannot so much as a blossom yield,
In lieu of all thy pains and husbandry.
–AS YOU LIKE IT
LOL. Are we witnessing the ultimate cherry harvest ever? A ROBUST vintage cherry harvest for a perfect cherry pie? I’m still struggling for a metaphor for an upside-down cherry, though.
Seriously, Some of the Team’s works may turn out to be the most egregious examples of bastardized science in history. If they had just backed down a little bit, years ago, they could have pleaded ignorance and “we just made a mistake.” But now, they are toast. Ego.
snip
jae
“Reminds me of Nixon, somehow”
Funny. It reminds me of Clinton. Guess it’s a matter of perspective.
Wonder why he changed the sign for just one part of the series? Why not allow arbitrary rescaling of the data at every point? Presto! Perfect proxies.
“oh what a tangled web we weave….”
gee, do you think this affects the stationarity assumption? wow
Actually the first version of Ooga Chaka made my hair stand up on end. It was either that or the sign flipping in the so-called proxies. Maybe both.
Wow.
Could someone tell me what this means in simple language?
Re: Andy (#25), in simple language, Mann has a bunch of proxies that fall into three buckets: temperature predictors, temperature anti-predictors (ie, the negative of the proxy is a “predictor”), or junk.
Now, a proxy should not be in two buckets. It especially cannot be both a predictor and an anti-predictor. But that, apparently, is exactly how they are used: depending on the time period Mann is predicting, he changes whether a proxy is a predictor or an anti-predictor.
Mugwump, as I understand it the sign change is done automatically by the algorithm: any anomaly is taken to be a “connected” anomaly and its sign adjusted accordingly. This particular proxy had its orientation inverted over one period because over that period it was inversely proportional to the signal it was being calibrated against, whereas over another period it was proportional to the signal and so was not inverted, despite being the same proxy.
Could someone confirm this understanding is correct ?
Re: anonymous (#26),
That’s my understanding too. But why stop there? Why not adjust the magnitude of the proxy while you’re at it, not just the sign. And why change all data points uniformly? You’ll get a much better fit applying independent scaling factors to every data point.
[Of course I am being sarcastic. The point is any manipulation that adjusts the data to match the signal is non-kosher. If he’s willing to go down that road, why not go the whole way?]
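The reductio can be made concrete with a deliberately absurd Python sketch (nothing here is from Mann’s code): with one free scale factor per data point, any series becomes a “perfect” proxy for any target.

```python
import numpy as np

rng = np.random.default_rng(2)
proxy = rng.standard_normal(100)          # any series at all
target = np.linspace(0.0, 1.0, 100)       # any "signal" you like

# One scale factor per point: a perfect fit with zero predictive skill.
# (Division fails only where proxy == 0, probability zero for continuous draws.)
scale = target / proxy
fit = scale * proxy

print(np.allclose(fit, target))   # True
```

The fit is exact and means nothing, which is the point: every degree of freedom you grant the calibration step is skill you can no longer claim for the proxy.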
While not a tree, I need to point out that Wilhelm Tell as well does hide behind some sort of woody plant, an elder bush to be precise.
The title of this thread is “Industrial Strength Voodoo Correlations” but perhaps “Mannufactured Correlations” might have been an alternative and otherwise appropriate title.
At any rate, it is becoming clear, as other CA readers have suggested, that a comprehensive paper dealing with the full spectrum of statistical issues to be found in Mann 2008 Et Al would be a most useful and informative addition to the body of peer reviewed climate science literature.
Re: Scott Brim (#29), I am afraid it would need to be a monograph. This is kind of like watching a magic show–what comes out of the hat next, ladies and gentlemen? You simply won’t believe your eyes.
Amen.
#31. It crossed my mind a couple of years ago that you could make a very stimulating statistics course simply on the MBH laboratory of horrors. It gives lively examples of how not to do things. If you add M08 into the mix, you could have two courses.
Re: Steve McIntyre (#33),
They would make a great course or lab in practical applications for statistics. But why limit it to just statistics? When you consider all of the errors (e.g., statistical, procedural, coding, archiving, etc.) in Mann’s work that have been revealed on this blog over the last several years, they don’t point as much to a misuse or misunderstanding of statistics as they do to some combination of carelessness, lack of attention to detail, and/or a lack of the ability or desire to thoroughly analyze a problem and select proper in-depth solutions. Add to that the fact that all of the referenced papers have passed peer review and been published in major journals, and this doesn’t bode well for climate science, or in fact science in general. Maybe an introductory course in the scientific method would be more appropriate. It does seem to have fallen out of favor of late.
Joe
#32. What made you think that Mann didn’t also adjust the scale of the proxy differently in every case? 🙂
Re: Steve McIntyre (#34),
oh dear
Re: Steve McIntyre (#34), I agree with Jeff (post #47 above) to the extent that I find a lot more fascinating and illuminating content in the subsequent discussions, particularly the #34 I chose to address this comment to…
Re #25, #16: Steve, I wonder how many readers understand what you are talking about – what is meant by ‘voodoo correlation’, ‘Mannomatic methods’ and ‘ex post selection of proxies by correlation’. If your blog is to have impact beyond a small clique, it is necessary to explain this clearly. Forgive me if I missed it, but I don’t recall you doing this.
As I understand it (please correct me if I’m wrong), the Mann et al process is essentially:
a. Choose a large number of ‘proxies’ (though they might as well be random number sequences) covering say 1000 yrs.
b. ‘Screen’ the proxies, ie only pick those that correlate with the instrumental record over the last 100 years.
c. Plot the average of those ‘proxies’ that pass the screening test – this guarantees you a hockey stick, because they will average out roughly to the instrumental record over the last 100 years and zero before then.
This is obvious of course (except to Michael Mann and his co-authors and the referees of the paper) and can be shown by a simple code using random walks, as shown by Lubos.
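That random-walk demonstration can be sketched in a few lines (Python rather than Lubos’s code; the 0.5 screening threshold and the walk count are arbitrary choices here): screen random walks against a rising “instrumental” ramp and the survivors’ average acquires a blade.

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(1000, 2000)
calib = years >= 1900                        # last 100 years = "instrumental"
inst = np.linspace(0.0, 1.0, calib.sum())    # rising instrumental record

# Step a: 1000 random walks standing in for "proxies".
walks = rng.standard_normal((1000, years.size)).cumsum(axis=1)

# Step b: screen -- keep only walks correlating with the instrumental record.
r = np.array([np.corrcoef(w[calib], inst)[0, 1] for w in walks])
passed = walks[r > 0.5]

# Step c: average the survivors. The mean tracks the instrumental rise
# through the calibration period by construction -- a blade from pure noise.
recon = passed.mean(axis=0)
print(passed.shape[0])
```

The survivors’ idiosyncratic wiggles cancel in the average while the selected-for calibration-period rise adds coherently, which is exactly the guaranteed hockey stick described in step c.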
Re: PaulM (#35),
…and if you want a cooling trend, you need to fix PC1. Interesting story as well.
Paul M: I think you also need a “few good proxies” to get a good HS.
Let’s see if I got this right. You need a Ph.D. to write this. Acceptance guaranteed.
What do you need to get a sound critique accepted?
#41. The sociology is interesting. Part of the sociology is surely that specialist climate scientists stand by like bumps on logs (the Silence of the Lambs in an earlier post). For example, Peter Brown knows (because we discussed it here with him) that Mann used Brown’s drought proxies upside down – combining this with cherry picking. But Peter Brown doesn’t have the faintest interest in picking a fight with Mann.
Or Dominic Fleitmann, who turned up here for a comment or two. Ironically it’s Fleitmann’s series that is opportunistically used in two different orientations.
Re: Steve McIntyre (#42),
It’s just that cavalier attitude, not caring whether one’s peers get it right or wrong, that could eventually put a lot of scientists out of business. I agree with most on this blog in that we desperately need to get the science right.
snip – one of your words refers to a topic strictly not allowed here
Don’t get me wrong, I am a great admirer of this blog and read it every day, but this post is a great example of why this blog has less influence than it should. The opening sentence, as PaulM mentions, would be baffling to an outside reader, who may not read on, and the main finding, which seems quite dramatic to me, is found in the comments (#5) not in the blog entry itself.
I suspect that the principals on the other side of the debate will be happy to ignore this unless something serious is done to publicise it, and nothing will come of it.
“snip – one of your words refers to a topic strictly not allowed here”
Oops, sorry Steve. I fully understand your position on that one. My bad!
Joe
Oops, it was #45, not #47… my bad…
Isn’t the point of a blog, or at least one of the points, to elicit relevant comments from readers? Maybe a thought is stuck and needs some prodding from the masses to complete it?
Mark
There seems to be some confusion among readers about this post. So I believe the following graph is a useful aid (and could even be usefully incorporated into the original post).
Essentially Steve is discussing the validation procedure in Mann et al 2008, presumably the NH CPS reconstruction for period 1950-1995 (last part of graph), which in turn is based on proxy calibration to instrumental temperature in the period 1850-1949 (“late miss” in Steve’s terminology).
In case anyone wants to refer to the SI or supplementary data, they are available here.
SD1.xls contains the “passing” values of r of each proxy for the three periods (1850-1995, 1896-1995, 1850-1949). The 1950-1995 reconstruction is based on the proxies found to be significant in the 1850-1949 period.
Here are some initial thoughts about the “opportunistic” use of the Socotra dO18 proxy. It does seem to me that the greatly varying correlation of this proxy, depending on the period chosen, could argue against its inclusion in paleoclimate reconstruction.
But what would be the implication of its exclusion for validation? To me, it seems that removal of the proxy would improve the validation scores, at least for the 1950-1995 recon. Why? Well, since the overall correlation is strongly positive, it is likely that the correlation in the 1950-1995 period was also positive. But as Steve has noted, in that reconstruction, the proxy was used in a negative orientation based on 1850-1949 calibration.
More generally, perhaps the CPS reconstruction would benefit from a screening procedure that would weed out the proxies with correlations that fluctuate wildly over the instrumental period (e.g. specify a maximum difference in r between the first half and second half of the calibration period as an additional screening criterion).
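That suggested extra screen is easy to state in code. A sketch with illustrative choices throughout (the r_min and max_dr thresholds, the half-period split, and both toy series are invented here):

```python
import numpy as np

def stable_pass(proxy, temp, years, r_min=0.1, max_dr=0.3):
    # Suggested extra screen (sketch): pass the full-period test AND
    # keep the two half-period correlations within max_dr of each other.
    mid = (years[0] + years[-1] + 1) // 2
    half = years < mid
    r_full = np.corrcoef(proxy, temp)[0, 1]
    r1 = np.corrcoef(proxy[half], temp[half])[0, 1]
    r2 = np.corrcoef(proxy[~half], temp[~half])[0, 1]
    return abs(r_full) >= r_min and abs(r1 - r2) <= max_dr

years = np.arange(1850, 1996)
temp = np.linspace(0.0, 1.0, years.size)
steady = 0.8 * temp                                        # tracks T throughout
flipper = np.where(years < 1923, -temp, 3.0 * temp - 2.0)  # unstable series

print(stable_pass(steady, temp, years),
      stable_pass(flipper, temp, years))   # True False
```

A steadily tracking series passes, while a series whose correlation swings between the halves of the calibration period, like the Socotra case discussed above, is weeded out.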