New Scientist on the Hockey Stick

New Scientist ran a lengthy article on the Hockey Stick. They seem to have talked to everyone involved except Ross and me.

In 2004, even before our GRL article was published, a freelancer for New Scientist had become interested in the story and spent a lot of time interviewing me on the telephone. It got to a very advanced stage and then was spiked by the New Scientist editor, following ExxonMobil-style disinformation of the sort that Mann sent to Natuurwetenschap to try to prevent publication there:

The editors decided not to publish it after all. Your connections with the oil industry raised doubts in their minds about your disinterested independent researcher status and the scientific corroboration from other groups for Mann’s findings persuaded the editors that the story simply did not stand up.


First of all, New Scientist argued that the hockey stick itself doesn’t “matter” to the AGW debate.

The hockey stick has been repeatedly misrepresented as the crucial piece of evidence when it comes to industrialization and global warming. It is not. Even if the hockey stick were shown to be a doodle that Mann did on a napkin during a night out, the evidence that the world is getting warmer and that this warming is largely due to human activities would still be overwhelming.

Having made that argument, they pose the multiproxy problem as follows:

Leaving that aside, did Mann get it right? There is no doubt that reconstructing past temperatures from proxy data is fraught with danger. Take tree ring records. They sometimes reflect rain or drought rather than temperature. They also get smaller as a tree gets older so annual or even decadal detail is lost.

To reveal the “signal” behind the noise of short-term and random change, a proxy record for one region must be based on as many tree ring records as possible. It must also correlate with direct measurements of temperature during the period of overlap, which adds another layer of complication, as in some cases human factors such as pollution might have affected recent tree growth.

It’s not the worst statement of the problem. However, using “as many tree ring records as possible” is not a sensible strategy if your methodology is a data mining methodology, such as both stages of MBH98: the tree ring PC method and the multivariate step. They then ask whether the proxies are any good, quoting Jacoby on a series from central China that “made tree ring people angry”:

So the first question is whether the proxy records Mann chose are reliable indicators of temperature. Some have been questioned. He has a series from central China that we believe is more a moisture signal than a temperature signal, says Jacoby. He included it because he had a gap. That was a mistake and it made tree ring people angry.

Mann accepts that some of the measurements he used do not directly represent temperature change. His argument is that, for instance, coral records showing rainfall records in the Pacific are proxies for El Nino cycles and so for changes in ocean temperature. Jacoby is not convinced.

I’ve never seen any criticism by tree ring people of this series from central China (which doesn’t affect MWP or 15th century issues). I think I know which one is involved, but it would be nice to confirm. And why would tree ring people dispute this Chinese series and not dispute the 11 instrumental precipitation series used by Mann, which include the delicious mislocation of French precipitation records to North America and the still unlocatable precipitation record assigned to Bombay (or many other precipitation-related records in MBH98)? New Scientist then quotes Mann on regional aspects (following a theme of Crowley’s):

Indeed, the proxy records suggest that high temperatures in one region tend to be balanced out by low temperatures in another. The tropical Pacific for instance appears to have cooled in the Medieval Warm Period and warmed during the Little Ice Age. “The regional temperature changes in our reconstruction are quite large; it’s simply that they tend to cancel out,” says Mann.

In the MWP and the 15th century, Mann’s reconstruction uses only one temperature PC and hence has no regional properties in its early portion. (From the mid-18th century on, his reconstructions generate regional patterns, but not in the early going.) I wonder what proxy records Mann has in mind for this claim about the tropical Pacific in the MWP and LIA. The other issue with proxies cancelling out is that many of the proxies are simply noise (or precipitation related), and that may be the reason that they cancel out. Mann goes on to puff his error bars:

Mann also points out that he was one of the first to include error bars, which show how much variance is lost due to smoothing.

Of course, these error bars were calculated in MBH98 from calibration period residuals, which were hugely overfitted. Had verification period residuals been used, the confidence intervals would have exceeded natural variability, since the verification r2 was ~0. New Scientist then gets to the MM dispute as follows:
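The overfitting point can be illustrated with a toy regression (my own illustrative sizes, not the actual MBH98 setup): regress a short stretch of “temperature” on enough pure-noise “proxies” and the calibration residuals shrink well below the verification residuals, so confidence intervals built from them are far too narrow.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative sizes only (not the actual MBH98 configuration):
# a short calibration period, a verification period, and many "proxies"
n_cal, n_ver, n_proxies = 79, 100, 40
y_cal = rng.standard_normal(n_cal)               # calibration "temperature"
y_ver = rng.standard_normal(n_ver)               # verification "temperature"
P_cal = rng.standard_normal((n_cal, n_proxies))  # pure-noise proxies
P_ver = rng.standard_normal((n_ver, n_proxies))

# Least-squares fit of temperature on the noise proxies
beta, *_ = np.linalg.lstsq(P_cal, y_cal, rcond=None)

cal_resid = y_cal - P_cal @ beta   # in-sample residuals, shrunk by overfitting
ver_resid = y_ver - P_ver @ beta   # out-of-sample residuals

# Confidence intervals built from calibration residuals are far too narrow
print(cal_resid.std(), ver_resid.std())
```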

A more serious accusation has come from two non-climate scientists from Canada, who claim to have found a flaw in Mann’s statistical methodology. McIntyre and McKitrick claim that the way Mann applied this method had the effect of dampening down natural variability, straightening out the shaft of the hockey stick and accentuating 20th century warming.

There is one sense in which Mann accepts that this is unarguably true. The point of his original work was to compare past and present temperatures so he analysed temperatures in terms of their divergence from the 20th century mean. This approach highlights differences from that period and will thus accentuate any hockey stick shape if, but only if, he insists, it is present in the data.

The charge from McIntyre and McKitrick is, however, that Mann’s computer program does not merely accentuate this shape but creates it. To make the point they did their own analysis based on looking for differences from the mean over the past 1000 years instead of from the 20th century mean. This produced a graph showing an apparent rise in temperatures in the 15th century as great as the warming occurring now. The shaft of the hockey stick had a big kink in it. When this analysis was published last year in GRL, it was hailed by some as a refutation of Mann’s work.

The data mining from Mann’s PC methodology has been well publicized, but people tend not to get the nuances right. You can get a hockey stick shaped PC1 from series in which there is no hockey stick shape in the underlying data. His method picks out and overweights series with 20th century trends and flips series so that the trends all point in the same direction. In the shaft, away from the common 20th century feature, the PC1 is just a weighted average of the various series and the noise features cancel out. The variance of the average is therefore small in the shaft but large in the blade.
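A toy simulation (my own illustrative parameters: AR1 noise with rho = 0.9, “short centering” over the last 100 steps as a stand-in for the 1902-1980 calibration period) shows the bias: PC1s of pure noise networks acquire a much larger blade under short centering than under full centering.

```python
import numpy as np

rng = np.random.default_rng(42)

def ar1(n, rho=0.9):
    """Persistent red noise with no underlying climate signal."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.standard_normal()
    return x

def hockey_index(short_center, n_series=50, n_years=600, cal=100):
    """|calibration-period mean minus overall mean| of PC1, in std units."""
    X = np.column_stack([ar1(n_years) for _ in range(n_series)])
    mean = X[-cal:].mean(axis=0) if short_center else X.mean(axis=0)
    U, S, _ = np.linalg.svd(X - mean, full_matrices=False)
    pc1 = U[:, 0] * S[0]
    return abs(pc1[-cal:].mean() - pc1.mean()) / pc1.std()

# Average over 20 noise networks each way
short = np.mean([hockey_index(True) for _ in range(20)])
full = np.mean([hockey_index(False) for _ in range(20)])
print(short, full)   # short centering inflates the blade on pure noise
```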

In the empirical situation of the North American tree ring data set about which there’s been so much dispute, there actually are some hockey stick shaped series in the data set (bristlecones). Given that the “Artificial Hockey Stick” effect exists with red noise, it doesn’t take much imagination to see that it is further enhanced by actual hockey stick shaped series. But if you take out the bristlecones, there’s no hockey stick (and bristlecones were known beforehand to be problematic).

New Scientist fairly states that we do not present an alternative reconstruction. It’s too bad that Climatic Change seems unable to address this error in Wahl and Ammann (even though it’s been pointed out to them):

M&M say their work is intended only to show that there are problems with Mann’s analysis. They do not claim their graph accurately represents past temperatures. “We have repeatedly made it clear that we offer no alternative reconstruction,” McIntyre states on his Climate Audit blog.

Then they get to Wahl and Ammann:

The work of Eugene Wahl of Alfred University and Caspar Ammann of NCAR in Boulder CO raised serious questions about the methodology of Mann’s critics. They found that the reason for the kink in the McIntyre and McKitrick graph was nothing to do with their alternative statistical method. Instead it was because they had left out certain proxies, in particular, tree ring studies based on bristlecone pines in the SW of the US.

“Basically the MM case boiled down to whether selected North American tree rings should have been included, and not that there was a mathematical flaw in Mann’s analysis” Ammann says. The use of the bristlecone pine series has been questioned because of a growth spurt around the end of the 19th century that might reflect higher CO2 levels rather than higher temperature and which Mann corrected for.

Once again, Ammann has misrepresented matters. Wahl and Ammann have a nasty habit of taking results that we reported, claiming them as their own and then reproaching us for not reporting them – a point that we made in our GRL Reply to A&W. After all the ink that we’ve spilled on bristlecones, it rankles that Ammann says that we’d “left out” the bristlecones. We didn’t “leave out” the bristlecones. We showed that MBH results were sensitive to the presence/absence of bristlecones or to the presence/absence of the bristlecones in the PC1. We were completely explicit about bristlecones. We did not claim that the MBH results came simpliciter from their PC method, but that the bad method interacted with the worst proxies. As to the famous CO2 adjustment, Mann did not make any adjustment to MBH98 for CO2 – that is pure disinformation. Plus his CO2 adjustment – based on 19th century saturation and low-frequency wiggle matching – is nonsensical.

Then New Scientist gets to the “other studies”:

What counts in science is not a single study… here Mann is on a winning streak – upwards of a dozen studies, some using different statistical techniques or different combinations of proxy records (excluding the bristlecone pines), have produced reconstructions more or less similar to the original hockey stick.

Whatever the flaws in the original work, it seems the broad conclusion is correct. McIntyre is not impressed. “There is a distinct possibility that researchers have either purposefully or subconsciously selected series with the hockey stick shape”, he told one reporter.

Readers of this blog are familiar with my position on the other studies. They are not “independent” in authorship. The proxies are not “independent”. Bristlecones are used repeatedly despite their problems.

Since MBH, a couple of other hockey stick shaped series have turned up (e.g. Yamal, Jacoby in Mongolia). It only takes a few such series to imprint a hockey stick shape on a small subset. Osborn and Briffa 2006 tried to show that their reconstruction could survive the deletion of 3 series, but they started off with 2 bristlecone/foxtail series, Yamal, Mongolia and Dunde – all the stereotypes.

The Hockey Stick studies all have nearly identical statistical problems: a failed Durbin-Watson statistic in calibration, collapsed verification r2 values and a spurious RE statistic.
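The RE/r2 divergence is easy to reproduce in a hypothetical sketch (made-up numbers, not actual MBH output): a “reconstruction” that captures nothing but a mean offset from the calibration period scores a strong RE while its verification r2 is near zero.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical numbers for illustration. Verification-period "observed"
# temperatures sit 1 unit below the calibration mean (taken as 0 here).
obs = -1.0 + 0.3 * rng.standard_normal(50)

# A "reconstruction" that captures only that mean offset; its year-to-year
# wiggles are noise unrelated to the observations.
recon = -1.0 + 0.3 * rng.standard_normal(50)

# RE benchmarks squared error against the calibration mean (0 here), so
# merely getting the offset right earns a high score
re = 1 - np.sum((obs - recon) ** 2) / np.sum(obs ** 2)

# r2 measures year-to-year covariance and ignores the offset
r2 = np.corrcoef(obs, recon)[0, 1] ** 2

print(re, r2)   # strong RE, near-zero r2
```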

133 Comments

  1. Dano
    Posted Mar 27, 2006 at 7:25 PM | Permalink

    Basically, the planet is warming rapidly and the evidence abounds for that rapid warming. So Job 1 is to ensure that societies can cope.

    Secondly, the value of showing that anthro warming is something that has happened before…um…wait. It hasn’t happened before. So, we must want to ensure that fossil fuel companies don’t get Big Tobacco-like lawsuits.

    Why else would we quibble? What other of our ideologies are offended by the knowledge that man has altered life on earth? The primacy of Humanism? Why else delay actions that allow us to cope?

    Best,

    D

  2. TCO
    Posted Mar 27, 2006 at 7:27 PM | Permalink

    Does the method really flip upside down hockey sticks?

  3. kim
    Posted Mar 27, 2006 at 7:35 PM | Permalink

    Why wonder, Dano, when it’s so much easier to just believe.
    ===================================

  4. TCO
    Posted Mar 27, 2006 at 7:36 PM | Permalink

    I want palm trees and gators. I’m not noticing appreciable warming damnit!

  5. jae
    Posted Mar 27, 2006 at 7:36 PM | Permalink

    Dano: Just what are you proposing that we do that will help significantly? Again, have you read Lomborg’s book? (Your answer: no, I would not touch that right-wing piece of trash; I would rather read my kind of publications, which contain the only truth.)

  6. kim
    Posted Mar 27, 2006 at 7:41 PM | Permalink

    The actions, Dano, that you would find necessary to cope are those designed to ameliorate an unproven anthropogenic component of unproven warming. What would you have us do to address ‘Job 1’ if there were proven warming and no proven anthropogenic component?
    ================================

  7. Paul Penrose
    Posted Mar 27, 2006 at 7:41 PM | Permalink

    Dano,
    Please state again how the models help prove that AGW is occurring, I really need a good laugh about now.

  8. kim
    Posted Mar 27, 2006 at 7:48 PM | Permalink

    I’ll tell you right now that one of the very best methods for societies to cope with change is to implicate the gods’ anger about human misbehaviour. This appears to be a classic response. But please don’t sacrifice my virgins for your superstitions.
    =============================================

  9. Posted Mar 27, 2006 at 8:34 PM | Permalink

    It’s painful that someone runs an article about a topic without the two people who are most famous for it. But we are used to these things, and I hope that you are used to it, too, Steve and Ross.

    Also, it is very characteristic that the hockey players jump from one problem to another. Every time you (or someone else) show that there is something wrong and a statement of theirs was incorrect, they jump to a different topic and make another (usually equally unjustifiable) statement. They never forget to mention that the consensus always holds despite an arbitrary number of errors that one can find with the arguments. A slick kind of soft science.

    But the reality is very different. In order for policies such as Kyoto to be justifiable, they need all of their beliefs to be valid simultaneously. The Earth must be warming, it must also be warming in the future, the warming must have primarily anthropogenic origin, this origin must be based on CO2, moreover warming must be a bad thing, and suppression of industry etc. must be one of the best ways to avoid warming, and so forth.

    It is clear that about 1/2 of these statements are probably wrong, but once again, they need all of them to be correct because the reasoning that justifies such policies depends on all of them. Analyses are not what really matters in such magazine creations. What matters is what people like Dano want to hear.

  10. John G. Bell
    Posted Mar 27, 2006 at 8:50 PM | Permalink

    Steve,
    When you talk about an article on your blog please include its title so that a google search will bring someone to your site. What I find so wonderful about the web is the ability to pull up an article and a reply like the one you have just made. Helps to understand the issues.

    It is a good reply. Hope as many people as possible get to see it.

  11. Ross McKitrick
    Posted Mar 27, 2006 at 9:01 PM | Permalink

    Not only did they not talk to us, it looks like they didn’t even read the articles. They refer to the graph in our E&E03 article, with the 15th century kink, as being in our GRL05 article. I suppose it’s “progress” that Mann admits his method accentuates hockey stick shapes as long as they are present in the data. Now if only he will admit that he knew the bristlecones didn’t belong in the data set, and that his claim of robustness was always untrue, then he’ll be catching up to reality.
    Ammann as usual misrepresents everything. I wonder why he feels such a need to keep jumping in on the debate, while being determined to give a phony spin. He makes it sound like you have to remove all the North American tree rings to wipe out the HS shape, when it’s just 16 bristlecones. And surely he knows that the “alternative” statistical method does matter, quite apart from the fact that it was never disclosed as such, since it makes the difference between the HS shape being assigned to PC1 with an exaggerated 37% explained variance, versus appearing in PC4 with less than 8% explained variance.

  12. Follow the Money
    Posted Mar 27, 2006 at 9:39 PM | Permalink

    “Your connections with the oil industry”

    Do you have any? Not that it would demean your research. Necessarily.

    What’s shocking are those participants in discussions who intimate there are no financial interests at all on the Kyoto side, only on the side questioning carbon based global warming. Now why would British Petroleum be pushing Kyoto? Competitive advantage over others less invested in natural gas. And so many other goodies.

    UK press says with angst that Kyoto targets won’t be met. Well, Doh! That’s what the Kyoto Carbon Credit Casino is for. Buy fictional reductions from Gazprom! (Schroeder knows the game.) Launder some money to partners in China on their “promise” to reduce output at some coal plant. They promise to turn over validating statistics!

    I noticed in Canada the new Conservative regime barred international carbon credit trading. But they’re keeping open domestic options. No need to give up the scam completely, for domestic corp’s sake.

    But it’s dangerous out there. Be careful. And don’t invest in any European tree farms. They may soak up carbon but the little guys are not for whom this game is being played. Ergo, the IPCC, etc. finds dupe scientists to debunk the tree claims. Recently happened, you may have read.

  13. nanny_govt_sucks
    Posted Mar 27, 2006 at 10:30 PM | Permalink

    #1: “Why else delay actions that allow us to cope?”

    There are a number of ways to “cope” that don’t involve tearing down the free-society economy that supports us all with a comfortable living (unless you live in a cave and make your own clothing, that is).

    Adaptation is already inherent in our free society. People can move freely to places where the climate suits them. Goods and services, including those that help us adapt like air conditioners and heaters, can flow pretty much freely from the producers to the people who need them.

    And if you believe that CO2 emissions are harmful, then why not call for the end to direct and indirect subsidies to the fossil fuel industry? Ending this type of corporate welfare is another step in the free society direction, and will likely raise the cost of gas at the pump leading to less consumption and a bigger market for alternative fuel vehicles.

    Here’s one of those subsidies that blew my mind recently :

    Government may waive near $7 bln in oil, gas royalties: report
    http://www.freerepublic.com/focus/f-news/1578337/posts

    But handing power and control of our lives and our economy over to politicians via GW legislation and international accords is the wrong direction, I believe. At the end of that line is socialism, totalitarianism, and the destruction of individual freedom. We’ve seen how that doesn’t work. I want to “cope” in the other direction.

  14. Posted Mar 27, 2006 at 11:30 PM | Permalink

    Re: TCO: “Does the method really flip upside down hockey sticks?”

    My understanding is that the way that Principal Components works is that if you have n series it effectively plots them in an n-dimensional space, then finds the plane through that space in which there is the maximum amount of variance. Since the series are being calibrated against the “instrumental temperature record” which spikes in the late 20th century, the series will be oriented such that they correlate with that spike. As I understand it the PC method doesn’t care how the series were originally oriented in that n-d space, it just looks for greatest correlation, and that includes in the negative direction. So yes, I believe if you calibrate against a climbing instrumental record, any series with a spike at the end, be it up or down, will contribute to an upwards spike in the result.

    Mr. McIntyre understands this all better than me, I’m sure, and can give you a better explanation. But I’m pretty sure, even if I got some details wrong, the gist of what I wrote is correct.

  15. Steve McIntyre
    Posted Mar 27, 2006 at 11:44 PM | Permalink

    Yes. In our E&E article, we pointed out that if you artificially increased all the non-bristlecone tree ring proxies by 0.5 in the North American network for the period 1400-1450, the Mann method would flip all these series over and actually REDUCE the temperature index for the 1400-1450 period. It’s a laughable method.

    Now in the actual North American network, the hockey stick shaped series (bristlecones) all happen to be upward pointing hockey sticks, but the PC method does not use that information. They could be equally distributed half and half down and you’d get the same PC1. That’s one reason why I don’t like PC methods for these sort of systems.

    You’re better off to pick a class that you believe ex ante to contain a temperature signal and then try to extract the signal, if it exists, by a mean.
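A stylized version of that E&E experiment (my own construction: one artificial hockey stick plus ten otherwise flat series given a +0.5 warm excursion in an early window, with MBH-style short centering over the last 100 steps) reproduces the flip: the warm excursion receives negative weights and pushes the index down.

```python
import numpy as np

n, cal = 600, 100
early = slice(0, 50)

# One upward hockey stick ("bristlecone-like")
stick = np.zeros(n)
stick[-cal:] = np.linspace(0, 1, cal)

# Ten otherwise flat series, all given a +0.5 warm excursion early on
others = np.zeros((n, 10))
others[early, :] = 0.5

X = np.column_stack([stick, others])
Xs = X - X[-cal:].mean(axis=0)        # MBH-style short centering

U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
w = Vt[0]
if w[0] < 0:                          # orient the hockey stick upward
    w = -w
pc1 = Xs @ w

# The flat series are flipped (negative weights), so their warm excursion
# REDUCES the early-window index instead of raising it
print(w[1:].max(), pc1[early].mean() - pc1[50:-cal].mean())
```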

  16. Posted Mar 28, 2006 at 1:41 AM | Permalink

    #15. Is it so odd? I mean, if you are going to use a correlation with something, a negative one is as good as a positive one. So groups of trees that together respond negatively to temperature could be equally as good as those that respond positively?

  17. Peter Hearnden
    Posted Mar 28, 2006 at 2:04 AM | Permalink

    Re #5 and others. By driving cars that do 50mpg+ (as I do), by not, as a species, flying around like hordes of desperate mossies (as people do but I don’t), by using trains rather than planes (I love trains, you can see so much), by (shock horror!) getting out of cars and walking or biking, by not being desperate to consume consume consume until you (literally for some, and in the US a lot) burst or drop, by looking up at the air and thinking ‘Yes, if I had it my way I’d try to leave it as I found it – I’ll do what I can’, by studying the atmosphere, recording it, doing studies to see how it behaved in the past, and how it will behave in the future, by having concern for our jewel of a planet and (since someone will certainly try to brand me religious about the place and thus dodgy – I’m not btw) having a thick skin, by putting contrary views here and other places and taking the flak, by joining with others to do all the things above, by not living a life where the only thing that seems to make it worthwhile for most is excess, rather than a good woman (or man or partner – whatever your preference), or the pleasure of the company of others, or the simple pleasures of life (I’m looking forward to the wonderful greenness of a sunny Devon spring day after rain – breathtaking), by reducing inequality in the world (I know I’m in trouble with that one – S word…). And, by not forcing people to do any of the above, but just being allowed to urge them without being called… well, you all know… for your efforts.

  18. mikep
    Posted Mar 28, 2006 at 2:06 AM | Permalink

    Are you going to write a letter to New Scientist?

  19. fFreddy
    Posted Mar 28, 2006 at 2:44 AM | Permalink

    Re #17, Peter Hearnden
    An understandable view, if you happen to inherit a family farm in beautiful countryside. Those of us who are not so fortunate have to work for a living.

  20. John A
    Posted Mar 28, 2006 at 3:41 AM | Permalink

    This would almost be palatable if it weren’t for the fact that Hearnden uses “red diesel” and pays a fraction of the price that the rest of us do.

    I look forward to the day when Hearnden announces he’s going back to ox and plough. I might even cheer.

  21. Peter Hearnden
    Posted Mar 28, 2006 at 3:49 AM | Permalink

    fF listen pal, I bl**dy work too! OK? And I earn JS! OK? You, I can guarantee, would not stir from your slumbers for what I earn!

    I do live in WHAT CAN BE beautiful countryside, I am lucky – it’s luck we’re happy to share. It’s also, at times: freezing, wet, dank, foggy, muddy, and horrid – you would not up with it put. You’re, to reply in kind, a soft townie who thinks milk comes out of bottles, or eggs out of boxes?

    I *don’t* complain about farmers lot! OK?

    Now, having dealt with the attack, you think my other points wrong?

  22. Peter Hearnden
    Posted Mar 28, 2006 at 3:52 AM | Permalink

    Re #20. Utter bo**ocks! You ***KNOW*** my views on red diesel yet you still bang on about it… I can’t believe it’s the best attack you have?

    It’s claptrap to suggest a variant of the ‘oh, you want us to go back to the stone age’ line about tackling AGW. NO I DO NOT, NOR DOES ANYONE – understand????

  23. Posted Mar 28, 2006 at 4:11 AM | Permalink

    Indeed, the proxy records suggest that high temperatures in one region tend to be balanced out by low temperatures in another. The tropical Pacific for instance appears to have cooled in the Medieval Warm Period and warmed during the Little Ice Age, “The regional temperature changes in our reconstruction are quite large; it’s simply that they tend to cancel out” says Mann.

    But Philip Houghton in “People of the Great Ocean” (ISBN 0 521 47166 4) says that the Polynesians discovered NZ – and, as NS says in the same issue, Easter Island – around 1200 AD, when he suggests (and he is an experienced sailor, pathologist and doctor) that the sea was warm and weather was favourable for exploration in ocean going canoes where the ability to survive cold and wet conditions was critical. “Polynesians have the ultimate cold adjusted physique”. More so than the Eskimo, he believes.

    But shortly after 1200, Polynesian exploration and two way voyaging seemed to wind down. Perhaps the Pacific weather got colder and windier in the LIA?

    To me, all this seems to contradict Mann’s claim.

  24. Posted Mar 28, 2006 at 4:16 AM | Permalink

    Apologies for the multiple posts. It seemed to be rejecting my post, but obviously it wasn’t.

  25. John A
    Posted Mar 28, 2006 at 5:29 AM | Permalink

    Utter bo**ocks! You ***KNOW*** my views on red diesel yet you still bang on about it… I can’t believe it’s the best attack you have?

    It’s claptrap to suggest a variant of the ‘oh, you want us to go back to the stone age’ line about tackling AGW. NO I DO NOT, NOR DOES ANYONE – understand????

    What you’re implying is a retreat to the 19th Century or even earlier. You forget that those same dread aeroplanes that bring Mr and Mrs Smith and kids to all parts of the globe on holiday, also bring in goods, especially foodstuffs from all parts of the globe. The farmers in Africa who sell food that you find in your local corner store, won’t thank you for cutting them from their markets in order to save them from a theoretical threat born entirely in the bowels of a computer and the bizarre unphysical beliefs of the modellers.

    They would be cut off from the world market in food, which has developed primarily with the advent of powered flight. They are too far away to be served by trains. Should I give them your address so that they can thank you personally for impoverishing them? Should I tell the people in Africa who serve the tourists, who bring with them lots of currency that they use to lift themselves out of poverty, that a West Country farmer’s apocalyptic beliefs about climate condemn them back to the Stone Age?

  26. Louis Hissink
    Posted Mar 28, 2006 at 5:31 AM | Permalink

    People,

    Cool it.

    I received some emails pointing to data which seem to contradict Gavin Menzies’ ideas of the activities of the Chinese Ming Dynasty.

    My gut feeling is to shout a “hold it” and review the evidence extant.

    Here we, in a climate sense, might be at a similar threshold.

    Louis

  27. TCO
    Posted Mar 28, 2006 at 6:34 AM | Permalink

    A. It’s pretty evident that they dusted off the old work and made a new article out of it. Is it the same author? So yes, they did contact you. They didn’t contact you for an update.

    B. I still find it hard to believe that the method flips series. Are you sure that this occurs? Can you see it happening with any of the series in MBH? Surely there must be some that are anti-hockey stick. (Thinking out loud: I guess this might have validity if you had a proxy which showed temp by getting smaller, like tree rings on the wrong side of the U.)

    C. Ross, I continue to not understand the amount of the sin created by the mining method:
    1. 38% variance versus 8% variance is easy to understand. But why the comments about what PC something is in. What IS THE GRAPH?
    2. What happens if you just take 9 flat lines and a hockey stick and MBH them? Straight averaging would give the result of 0.1 of a hockey stick. What does MBH give in that case? BTW, this is a hell of a lot easier to understand than your red noise stuff…
    3. Mann comments that straight averaging gives the same result as MBHing. Then we have the kerfuffle about geographic extent. but what does a straight average (with some geographic normalization–and how should one do that) give? Also, why would one expect PCA to correct for geographic extent? Unless you do a multiple regression with geography as a variable, how can PCA do anything magic?

  28. Steve McIntyre
    Posted Mar 28, 2006 at 7:29 AM | Permalink

    A. The authors are different. I contacted the first author and he was unaware of the new article. I don’t think that the new article used the old article at all.

    B. Yes. That’s what PC methods do. When you apply Mannian PC methods to red noise – with some series going up and some going down – the method aligns them all to create the maximum 20th century trend, but the common feature wears off in the shaft and you get the averaging effect reducing variance in the shaft. The method is biased since it nearly always generates a hockey stick – it mines data sets; it does not average them. You can get a similar effect by cherry picking. If you simply pick series from red noise networks that are trending up in the 20th century and average the selection, you get a HS as well. For example, Jacoby picked the 10 “most temperature sensitive” series out of 36 and averaged them. If you simulate 36 series with typical autoregressive persistence and pick the 10 with the biggest closing trends, you get a HS.
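For example, a minimal simulation of that selection effect (my own illustrative parameters: 36 AR1 series, keep the 10 with the largest closing upticks) yields a hockey stick composite from pure noise:

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1(n, rho=0.9):
    """Autocorrelated noise standing in for a tree ring chronology."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.standard_normal()
    return x

n_years, n_series, keep = 400, 36, 10
X = np.column_stack([ar1(n_years) for _ in range(n_series)])

# "Most temperature sensitive" = largest closing-period uptick
criterion = X[-50:].mean(axis=0) - X[:-50].mean(axis=0)
chosen = np.argsort(criterion)[-keep:]
composite = X[:, chosen].mean(axis=1)

# The selected composite ends high even though no series contains a signal
print(composite[-50:].mean() - composite[:-50].mean())
```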

    In the North American tree ring network case, the bristlecone hockey stick series actually do have a consistent sign. But this information is not used by Mann’s PC method – it wouldn’t matter whether they were up or down to Mann’s method. It really is that awful a method. And there are other bad aspects in the later parts.

    When we realized how bad the method was, we used it as a tool to see what it did in the actual and then controversial (after MM03) North American data set. After MM03, Mann had said that his North American PC1 was the dominant component of variance. But when we followed the method, it was just the bristlecones in the PC1. It wasn’t flipping the series (but it would have). The problem was that the series were known to be bad and yet these were the ones that imprinted the final result. Mann knew that the series were bad and that his results failed without them – the CENSORED file. Given the abuse that we were then taking, it blew me away when I figured out that the CENSORED file was his results without bristlecones. Yet they said their results were “robust” to the presence/absence of all dendroclimatic indicators. The bad method interacts with the worst proxies.

    C. PC methods do not normally “preserve” the identity of individual subsets, unless the subset is a pattern that is orthogonal to the space spanned by the other series. In this case, in terms as simple as I can make them, the pattern created by the average of the bristlecones is orthogonal to the 50 other series. Thus even if you use ordinary PC methods (which are no magic bullet), the bristlecone pattern remains distinct and can be identified in a slightly different blend in the PC4.

    C2. Good point and excellent example to illustrate data mining on outliers. It’s a good math approach and I should have thought of it. If you take 9 straight lines and 1 hockey stick and MBH them, you will get 100% weighting in the PC1 on the 1 hockey stick.
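
    A toy version of the 9-lines-plus-one-hockey-stick point, using short-segment (calibration-period) centring as a stand-in for the Mannian decentering step; amplitudes and segment lengths are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 580
cal = slice(-80, None)                    # "calibration" period: the last 80 steps

# Nine near-straight lines (tiny noise, so the decomposition is non-degenerate)
# plus one hockey stick: flat shaft, then a rising blade.
lines = 0.1 * rng.standard_normal((9, n))
stick = np.concatenate([np.zeros(n - 80), np.linspace(0.0, 2.0, 80)])
X = np.vstack([lines, stick])             # 10 series x n time steps

# Short-centred ("Mannian") PC1: subtract each series' calibration-period
# mean rather than its full-period mean, then decompose.
Xc = X - X[:, cal].mean(axis=1, keepdims=True)
u, s, vt = np.linalg.svd(Xc, full_matrices=False)
weights = u[:, 0] ** 2 / (u[:, 0] ** 2).sum()

print(weights[-1])                        # share of PC1 carried by the one hockey stick
```

    Short-centring leaves the stick’s shaft a full unit below zero, giving it enormous apparent variance, so the PC1 is essentially the one hockey stick alone.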

    C3. Straight averaging does not give the same result for MBH98 as MBH’ing. Look at our NAS presentation where we show the simple average of all MBH proxies. Of course, if you cherry pick a small subset of 14 proxies out of 415 and then average them, you can get a hockey stick using a “different” method. That’s what the non-Mann studies do.

  29. TCO
    Posted Mar 28, 2006 at 7:36 AM | Permalink

    So it’s a crappy method that might flip series (but didn’t in this case), or one which exaggerates the influence of a small number of hockey sticks (but didn’t in this case… because there wasn’t a small number, there was a large number).

    Also, IS THE MBH GRAPH a PC1?! If not, why do you talk about that vice the effect in aggregate?

  30. TCO
    Posted Mar 28, 2006 at 7:44 AM | Permalink

    I would think about disaggregating things into a few different sins.
    a. Using a method which is susceptible to errors (which did not occur in this example…thank God…but let’s avoid this in the future…still…for the danger it presents).
    b. Using a method which DID lead to errors in this example (quantify how much impact).
    c. Using suspect proxies.
    d. Not admitting how the reconstruction differs without the suspect proxies.

    Your comments about PC1 or 4 or the like are confusing. Isn’t what we care about how much of the GRAPH comes from the errors? IS THE GRAPH the same thing as the PC1?

  31. Steve McIntyre
    Posted Mar 28, 2006 at 7:52 AM | Permalink

    #29. It does flip series, but it doesn’t flip the bristlecones. Other series are flipped in the PC1 to align better with the bristlecones, enhancing the blade.

    No, while there are 20 bristlecone sites, the HS effect really comes from fewer than 10 of them. They all come from one researcher, Donald Graybill, who was actively trying to prove the existence of a CO2 fertilization effect. If you look at high-altitude sites not collected by Graybill, there is a marked difference in HS-ness. I also want to see what happened at Sheep Mountain, the #1 weighted site, in the 1990s. It was collected by Hughes in 2002, but we haven’t heard a peep. My penny stock instincts tell me that if they had “good” results, we’d have heard by now.

    The MBH98 NH reconstruction is a weighting of the North American PC1 and other proxies. Again it is not simple averaging but a sui generis multivariate method, which overweights the North American PC1 thereby imprinting the NH temperature reconstruction.

    As I’ve pointed out before, if you substitute tech stocks for the bristlecones and do an MBH procedure, you get a reconstruction with a higher RE than MBH. If you use the tech stock PC1 and white noise for the other proxies, you do even better.

    Half the problem is that it’s hard to believe just how bad their methods are.

  32. TCO
    Posted Mar 28, 2006 at 8:00 AM | Permalink

    It does flip series, but it doesn’t flip the bristlecones. Other series are flipped in the PC1 to align better with the bristlecones, enhancing the blade.

    How much of your criticism of the method (wrt “flipping”) is for the potential error and how much for the actual damage? Both are appropriate criticisms, but they are different. One is a warning on why not to use the method as a tool for future work; the other implicates the results obtained. Numerically, how much of the “hockey stick index” (your definition) is due to “flipping”? What are some examples of flipped series in the MBH work? Why the caveat about the PC1? Do the errors that propagate into the PC1 (from flipping) continue into the aggregate (that’s what we care about)?

  33. TCO
    Posted Mar 28, 2006 at 8:12 AM | Permalink

    No, while there are 20 bristlecone sites, the HS effect really comes from fewer than 10 of them. They all come from one researcher, Donald Graybill, who was actively trying to prove the existence of a CO2 fertilization effect. If you look at high-altitude sites not collected by Graybill, there is a marked difference in HS-ness. I also want to see what happened at Sheep Mountain, the #1 weighted site, in the 1990s. It was collected by Hughes in 2002, but we haven’t heard a peep. My penny stock instincts tell me that if they had “good” results, we’d have heard by now.

    a. I think the better comparison is to how many total series (bristlecone and non-bristlecone) there are and what percent of the HS index comes from the 10. Not the 10 of 20. How does it differ from straight averaging? (my one hockey stick and 9 flat lines example). Also, you need to be clear about the end effect after feeding through the Mannomatic. Not the impact on the PC1, which is an intermediate result.

    b. The part about the CO2 impact is a separate issue, Steve. If the method is a flawed statistical analysis, it will be flawed even if the CO2 effect is not in existence (and this is more suspected than proven, mind you).

    c. The divergence problem is a separate issue. If the statistical analysis is a flawed method, it will be flawed whether or not the divergence problem is as suspected.

  34. TCO
    Posted Mar 28, 2006 at 8:18 AM | Permalink

    As I’ve pointed out before, if you substitute tech stocks for the bristlecones and do an MBH procedure, you get a reconstruction with a higher RE than MBH. If you use the tech stock PC1 and white noise for the other proxies, you do even better.

    This is a separate issue from the identified flaw in the statistical analysis (essentially a type of averaging/regression that overcounts certain features). I think the rejoinder here is that Mann is using series that have some physical reason for being proxies (not only the correspondence in the measurement period). We can get into the nitty gritty of how much of a physical basis he has… and your example points to an absurd one that has no physical basis, I think. But still, it’s separate from the averaging/regression flaw.

  35. TCO
    Posted Mar 28, 2006 at 8:21 AM | Permalink

    The MBH98 NH reconstruction is a weighting of the North American PC1 and other proxies. Again it is not simple averaging but a sui generis multivariate method, which overweights the North American PC1 thereby imprinting the NH temperature reconstruction.

    a. I think what is relevant is the impact at the end of the Mannomatic. Not the impact on the PC1. That’s what I’m reacting to in your remarks.
    b. What does “sui generis” mean?
    c. Does the method have regional weighting? Is location an input? If I have 10 Eastern series and 90 Western series, will his method “average” the subaverage of eastern with the subaverage of Western? Or just average all together irrespective of location? Or something in between? And which is more appropriate?

  36. TCO
    Posted Mar 28, 2006 at 8:27 AM | Permalink

    Half the problem is that it’s hard to believe just how bad their methods are.

    Agreed. They are also poorly documented. I think using new methods with new problems is a bad idea in general. I remember one of my colleagues using a very tricky and new method of TEM crystal structure determination on some (new to the world) layered, complicated oxides. One of the reviewers very reasonably pointed out that if a new method (and a new compound/structure) are being reported, it would be better to show that the new method works on a KNOWN structure.

  37. Steve McIntyre
    Posted Mar 28, 2006 at 8:43 AM | Permalink

    I’ll try to reply later. In our NAS presentation, we emphasize their use of a “new” method in a way that we didn’t before. “Sui generis” means that it’s one of a kind; it’s not a usual statistical method whose confidence interval properties are known. (Hey, I’m old enough that I took 5 years of Latin in high school; people still did that back in the day.)

    A lot of the effort in this process has been detective work and decoding. Some of the things that took a lot of time to reconcile “don’t matter” in the sense that they are merely sloppy but don’t change the results.

    I agree that it is the overall Mannomatic method that matters. In our NAS panel presentation, we attempted to keep the focus on the overall method and not just the PC step, as the other parts have a strong effect as well and are not validated methods. Again, look at the first figure of our NAS panel presentation – the overall mean of the series does not have a HS shape. The HS shape emerges only by applying Mannomatic methods. That doesn’t mean that the original proxies are any good.

  38. JerryB
    Posted Mar 28, 2006 at 8:44 AM | Permalink

    TCO,

    Do you want Steve to write journal articles, or do you want him to tutor you about the same stuff that you didn’t grasp a few months ago when you first started deluging him with questions? 🙂

    Steve,

    Is the article online? If so, can you post a link?

  39. kim
    Posted Mar 28, 2006 at 8:48 AM | Permalink

    Basically, ‘it follows from itself’, or self-generated. It’s all barbar a moi.
    ===================================

  40. Steve McIntyre
    Posted Mar 28, 2006 at 8:57 AM | Permalink

    Link is now at Articles – right frame.

  41. Posted Mar 28, 2006 at 8:57 AM | Permalink

    TCO, again I am not an expert, but here’s my impression of what the terms PC1, PC2, PC3, PC4, etc. mean.

    Recall that I said that the PC method was similar to plotting the n series in n dimensions and finding the plane in that n-space in which the plotted data has the highest variance?

    My understanding is that where the data points project onto that plane gives the PC1. The signal of the PC1 is then subtracted out of the data. Then, another plane is fitted to the data to find the highest variance of *what is left of the signal* after having subtracted the PC1. This is the PC2. Then the PC2 is subtracted from the data, and another highest-variance pattern is found for the PC3, which is subtracted, etc.

    In other words, signals that are found in the PC4 are weaker than the signals in the PC1, PC2 and PC3. How much weaker? I don’t think it’s well-defined. Imagine your data only contains two signals, a very strong one and a very weak one. The PC method will find the very strong one in the PC1 and subtract it, leaving the very weak one. Then the PC method will find this as the PC2 and subtract it. The rest of the PCs will be flat.

    I’m sorry if this explanation is misleading, as I am only repeating what I have been told, and may have messed it up somehow in my brain between being told and explaining here. However, I think what I have said makes sense. Steve or someone else can correct me if I am way off base.
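
    The fit-subtract-repeat picture above can be checked numerically: extracting components one at a time from the residual reproduces, up to sign, what a single all-at-once decomposition gives. A sketch on arbitrary random data:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 8))
X -= X.mean(axis=0)                  # centre each column first

def pcs_by_deflation(X, k):
    """Extract k PCs one at a time: best rank-1 fit, subtract, repeat."""
    E, pcs = X.copy(), []
    for _ in range(k):
        u, s, vt = np.linalg.svd(E, full_matrices=False)
        pc = u[:, 0] * s[0]          # scores of the current leading pattern
        pcs.append(pc)
        E = E - np.outer(pc, vt[0])  # remove it; continue on the residual
    return np.column_stack(pcs)

P = pcs_by_deflation(X, 4)

# Reference: the same four components from a single SVD of the full matrix.
u, s, vt = np.linalg.svd(X, full_matrices=False)
ref = u[:, :4] * s[:4]

print(np.allclose(np.abs(P), np.abs(ref)))   # identical up to sign
```

    Each successive component explains less of the remaining variance than the one before it, which is the sense in which a PC4 signal is “weaker” than a PC1 signal.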

  42. Jim Erlandson
    Posted Mar 28, 2006 at 9:05 AM | Permalink

    Today’s Wall Street Journal OpinionJournal has an interesting piece: “Kyoto? No Go. How to combat ‘global warming’ without destroying the economy.”

    … we must keep researching the real cause of climate change to understand better the sun’s solar output and the historical rise and fall of global temperatures.

    Pete du Pont presents a reasonable and rational summary of what is known and unknown, as well as what may or may not be done about it – it being “global warming”, anthropogenic or otherwise. It reminds us that there are still more questions than answers and that any response will be expensive. Very expensive.

  43. kim
    Posted Mar 28, 2006 at 9:13 AM | Permalink

    The expense will be directly related to our need to believe we can moderate the climate. It should instead be directed at whatever final value judgement is made about the ‘right’ climate. And the process of valuing should be transparent and coherent. What are the chances of that?
    ===============================================

  44. JerryB
    Posted Mar 28, 2006 at 9:22 AM | Permalink

    Re #40,

    Steve,

    I meant a link to the “New Scientist” article.

  45. TCO
    Posted Mar 28, 2006 at 9:27 AM | Permalink

    Jer,

    I think my questions are good ones. Steve knows a lot more than I do and works a lot harder than I do. And I give him credit for that. But still, my questions are right on target for disaggregating the issues. If someone on the NAS panel had asked them, would you have complained about that?

  46. Posted Mar 28, 2006 at 10:12 AM | Permalink

    Reading Dano’s reply, I’m struck by just how similar his reply sounds to someone from the Intelligent Design crowd justifying the hand of God in creation. Basically:

    “All life is extraordinarily complex and the evidence abounds for a designer…”

    Standard statement of “I don’t have to do any homework, I just know it to be true.”

    Of course, those on the political side who so readily subscribe to AGW might pale fairly quickly if they ever realized just how similar they sound to their favorite targets of ridicule.

    Sad.

  47. Ross McKitrick
    Posted Mar 28, 2006 at 10:13 AM | Permalink

    TCO, here are a few responses to your queries, and re-query if I miss something.

    The order of PCs matters for interpreting them. I find it easier to get intuition algebraically rather than geometrically. Take an nxk matrix X. Suppose you want to approximate all k columns of X using a single vector q and a set of scalars a(1), … ,a(k). Your approximation to X will be [a(1)*q, a(2)*q, … a(k)*q]. This will leave you with an nxk matrix of residuals, call it E. If you choose q to minimize the trace of E’E, subject to the norm of [a]=1, then q will be the first PC of X. Now do the same operation on E, and the first PC of E is the second PC of X. Your approximation to E leaves a matrix of residuals F, and the first PC of F is the 3rd PC of X, and so forth. That’s why it matters if the hockey stick is in PC1 or PC4. PC4 is the stuff left over after the top 3 dominant patterns are accounted for, in an exercise ultimately aimed at identifying the one dominant pattern. In MBH99, they talked about the NOAMER PC1 as being essential to the overall results. If they had been talking about the NOAMER PC4 being essential, they’d have been laughed off the page. If your results based on PC1–PC3 give you one conclusion, and then everything is reversed by adding in PC4, that doesn’t establish that PC4 really is the dominant pattern, since that contradicts the underlying definition of the PCs. What it shows is that you don’t have a very robust method.

    The hockey stick itself is not merely a PC; it is a weighted average of the underlying proxies. The weights are determined by the PC process that groups proxies, and the regression-like operation that maps them to temperature data. The Mannomatic PC operation takes the bristlecones in the NOAMER network and boosts the weight on them, piling them into the PC1, thereby identifying them as the dominant pattern of variance. By inflating the eigenvalue associated with the hockey stick PC, it suggests that the bristlecones represent a strong, dominant signal in the data.

    The flipping issue arises because the PC algorithm will add a minus sign to one of the a(.) coefficients if doing so reduces the trace of E’E, even if the physical interpretation of that makes no sense. If half the proxies trend up and half trend down, and the physical argument is they should all trend up, then common sense tells you they’re probably not well-behaved proxies. Take a PC1 from that group and the PC1 will trend up (or down) simply because it puts – and + signs in as needed to line them up. But that only masks the fact that the proxies are not well-behaved.

    27 C3… as Steve says, we showed in our NAS talk what straight averaging looks like. The MBH method groups the data and then maps to temperature. There is an infinite number of ways to do it. You can get a HS without PCs, as long as you let the bristlecones dominate at the mapping step. But then you need to justify letting the bristlecones run the results. That’s one sin in the Mannomatic PC method – it gives a false justification for putting dominant weight on the bristlecones. The red noise argument is necessary for showing why the RE benchmark is not zero when using Mann’s PC method on autocorrelated data; it also suggests that you could get a completely artificial hockey stick on tree ring data that has long-memory structures, but in our GRL article we don’t press that point. Instead we show that the particular effect in MBH98 was to select the bristlecones for dominant weighting in PC1.
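
    The sign-flipping Ross describes can be seen in a toy network (counts and amplitudes invented): five noisy proxies trend up, five trend down, and the PC1 loadings quietly give the two halves opposite signs so that everything “lines up”.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
trend = np.linspace(0.0, 1.0, n)

# If all ten series were responding to the same temperature signal they
# should share a sign; here half go up and half go down.
up   = np.array([ trend + 0.3 * rng.standard_normal(n) for _ in range(5)])
down = np.array([-trend + 0.3 * rng.standard_normal(n) for _ in range(5)])
X = np.vstack([up, down])
X -= X.mean(axis=1, keepdims=True)   # centre each series

u, s, vt = np.linalg.svd(X, full_matrices=False)
loadings = u[:, 0]

# The two halves get opposite-signed loadings: PC1 "aligns" them, and the
# disagreement among the proxies disappears from view.
print(np.sign(loadings[:5]), np.sign(loadings[5:]))
```

    The resulting PC1 trends strongly in one direction even though half the network says the opposite – exactly the masking effect described above.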

  48. JerryB
    Posted Mar 28, 2006 at 10:25 AM | Permalink

    “disaggregating the issues” as in tutoring TCO, again.

    Your NAS panel reference seems inappropriate. If someone on the NAS panel had asked the questions, the answers would be in the records of the hearing, a very different outcome than your asking in this blog, where many of the answers have already been posted, some possibly in response to questions you asked months ago.

    Otherwise, if anyone has a link to the article that is the main subject of this thread, please post it. Thanks in advance.

  49. TCO
    Posted Mar 28, 2006 at 10:45 AM | Permalink

    Jer, I’m not looking for tutorials per se. My interest is not purely studentish but is also of a nature of argument/discussion to make sure that we are being very clear about what is asserted, whether it is germane to the issue at hand, etc.

    Ross, I will read your comment in detail and will likely re-query or even restate a (slightly) critical comment about the particulars of how you criticize Mann.

  50. John Hekman
    Posted Mar 28, 2006 at 10:51 AM | Permalink

    JerryB

    Here is the link.

    http://www.newscientist.com/channel/earth/mg18925431.400-climate-the-great-hockey-stick-debate.html

    “New Scientist” seems pretty wacky. Every article is a scare-fest. Run for the bunkers. They could not give a level-headed report on McIntyre and McKitrick if they wanted to, because they would lose their entire readership or have their offices blown up.

    What bothers me much more is the cover of Time magazine, which I thought was a news magazine when I was in college decades ago. Now it seems to be the National Enquirer.

  51. Jim Erlandson
    Posted Mar 28, 2006 at 10:53 AM | Permalink

    #48: The article is previewed here but requires a subscription for the whole thing.
    Article Preview
    Climate: The great hockey stick debate

  52. jae
    Posted Mar 28, 2006 at 10:56 AM | Permalink

    Dano: did you see #42 and read the link? Any better ideas?

  53. JerryB
    Posted Mar 28, 2006 at 11:42 AM | Permalink

    Thank you John and Jim.

    TCO,

    Yes, you are not looking for tutorials per se. You are peppering Steve, and/or Ross, with comments which suggest that you think that you are sufficiently knowledgeable to “make sure that we are being very clear about what is asserted, whether it is germane to the issue at hand, etc”, while also peppering them with questions that suggest otherwise.

    Something about that combination seems a bit much.

  54. kim
    Posted Mar 28, 2006 at 11:43 AM | Permalink

    jae: Pebble bed nuclear reactors which the Chinese will market worldwide and recover the exhausted pebbles.

    Where should unrecyclable waste go? Deepest oceanic trenches.

    Or ionic ‘cold, but really very hot’ fusion.
    ===============================================

  55. TCO
    Posted Mar 28, 2006 at 1:35 PM | Permalink

    Let’s go one at a time. “Flipping” is listed as a criticism of the Mann method. I want to know what effect it had. If we can’t quantify it, then it’s not science.

    1. Which series were “flipped” in the generation of the hockey stick?
    2. How much (numerically, using the MM definition) did this flipping change the hockey stick index – for the hockey stick itself?

  56. Steve McIntyre
    Posted Mar 28, 2006 at 2:15 PM | Permalink

    #55. TCO, I’ve continually said that we’re not running a beauty contest for MBH faults. They have a bad method, and the bad method interacts with the worst proxies. But we say time after time that it is not the method simpliciter; still, it is a lousy method – and not just the PC step.

    If you take out the worst proxies, you get a different answer. If you change the method – e.g. to a mean of standardized series – you get a different answer. We’ve also talked about statistical significance and robustness.

    As to “flipping”, that’s one property of the method – as a method. We didn’t say that this was THE big problem in MBH – just that this was one aspect of lousy methodology. There are some series that are flipped, but they’re not a big deal in the North American network. In your terms, this would appear to be a “thank god” problem. But it’s not. The same bad behavior of the method that causes the flipping also causes the method to overweight hockey stick shaped series and, in this case, to put bristlecones in the PC1. Thus the MBH PC1 is imprinted by proxies that are known to be problematic.

    In terms of practical impact, your first example is more appropriate: say you have 10 series, of which 1 is a hockey stick and 9 are straight lines. Apply MBH to it and you get the hockey stick back.

    The MBH hockey stick shape is imparted to it by the bristlecones. Take the bristlecones out and you don’t get a HS. Does that mean that THE problem is the bristlecones, rather than the method? No, the method is crappy too. The bad data is highlighted by the bad method. I don’t know what else I can say.

    Then they say, well, we can still “get” a HS other ways. But there’s always a problem in the other studies as well. The bristlecones occur in many other studies.

    Also their methods enable a couple of HS series to imprint small-subset averages.

    For example, let’s say that you’ve got one HS series with a decent-sized blade and 9 white noise series and average them. You’d think: well, the blade is going to get averaged down by the white noise, so if there’s a blade in the final reconstruction, it can’t just be from one or two series. But think about the re-scaling trick in the small subset series. They do the average and then say, oops, the variance of our reconstruction doesn’t match observed variance. So they blow up the variance in the instrumental period and, lo and behold, you pretty much get the blade of the one HS series back.
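
    A toy version of the re-scaling step described above (amplitudes invented; the white noise is given modest variance so the arithmetic is visible):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
cal = slice(-100, None)                  # last 100 steps = "instrumental" period

stick = np.concatenate([np.zeros(n - 100), np.linspace(0.0, 2.0, 100)])
noise = 0.3 * rng.standard_normal((9, n))
recon = np.vstack([noise, stick]).mean(axis=0)   # simple average: blade diluted ~10x

# "Oops, the variance doesn't match observed": inflate the reconstruction so
# its calibration-period variance equals that of a toy instrumental target,
# here taken to be the hockey stick itself plus a little measurement noise.
target = stick + 0.1 * rng.standard_normal(n)
scale = target[cal].std() / recon[cal].std()
rescaled = recon * scale

print(scale, recon[cal].mean(), rescaled[cal].mean())
```

    Rescaling is a global multiplication, so the shape is unchanged – but the blade’s amplitude is pumped back up several-fold toward that of the single hockey stick series, and the shaft noise is inflated right along with it.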

    If they were truly interested in independent studies, why would they use bristlecones in so many studies: MBH, Crowley and Lowery (2 of 13); Esper et al 2002 (2 of 14); Osborn and Briffa (2 of 14) etc. But it’s not just bristlecones. There are a few other series that recur time after time: Dunde, Yamal, Sol Dav.

  57. Brooks Hurd
    Posted Mar 28, 2006 at 2:52 PM | Permalink

    Mann must have realized what he was doing when he performed his “interesting” PC analysis and his unusual re-centering. This was highly unlikely to have been done by chance. Therefore, he should have expected to be challenged on his methodologies. I can’t imagine that he was so arrogant as to have assumed that everyone would just accept such odd applications of statistics. Why then does he not have a better defense for what he did?

    It is now 8 years since MBH98 was published. Mann is still resorting to obfuscation, innuendo and ad hominem arguments.

  58. TCO
    Posted Mar 28, 2006 at 3:20 PM | Permalink

    Steve,

    1. I think that beauty testing IS relevant. This is the nature of the Von Storch comment: that the flaw exists, but the effect is small. Note that even if VS was WRONG, that’s not the nature of my point. My point is that it is RELEVANT to characterize the extent of an effect. We can differentiate between things that change the shape of the graph and those that show poor practice. And we gain useful information from doing so. And we gain information by quantifying things.

    2. Just to nail this down. Would you agree with the following statement: you don’t know the percentage of the HS index that is created by inappropriate flipping in the HS itself (I assume you would say so if you did), but you think it is not substantial (wouldn’t change the impact of the graph). BTW, what would you guess? Less than 50%, 10%, 5%, 1%?

    3. What if the bristlecones are really great proxies? What if we take a time machine and see how great they are? Still, the method would be overcounting them, no? So, yes, I think you can disaggregate how much of the effect is from the method itself. Imagine doing a “proper” method with the same data versus doing the Mannomatic. The difference in HS index shows the impact of the method itself. This is the whole point of the Burger paper.

    4. In your example, why do you use white noise, versus flat lines?

    5. “They do the average and then say, oops, the variance of our reconstruction doesn’t match observed variance. So they blow up the variance in the instrumental period and, lo and behold, you pretty much get the blade of the one HS series back.”
    a. I don’t quite follow this.* The variance of the data itself? Or the correlation to instrument? The variance of the data itself would be higher in the blade (instrument) time period, so why (and how) would you “blow that up”?
    b. But I assume that this is a separate methodological complaint from flipping, no? Shouldn’t we address the point in question vice throwing in other flaws? It’s almost like Mann not responding to the question about r2, if you respond in a flipping discussion with other faults that are not flipping.

    *Sorry, this is (maybe) a studentish question, but I want to be precise.
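
    TCO’s point 3 can be made operational on pure noise: run the same synthetic network through conventional full-period centring and through short (calibration-period) centring, and compare hockey-stick indices (closing-segment mean minus overall mean, in standard-deviation units – an MM-style definition). A sketch with made-up persistence and sizes:

```python
import numpy as np

rng = np.random.default_rng(5)

def ar1(n, phi, rng):
    """One AR(1) ('red noise') series of length n."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

def hs_index(y, m=100):
    """Closing-segment mean minus overall mean, in sd units."""
    return (y[-m:].mean() - y.mean()) / y.std()

n, k = 600, 50
X = np.array([ar1(n, 0.9, rng) for _ in range(k)])   # trendless red noise network

def pc1(X, center):
    """First PC after centring each series on the given period."""
    Xc = X - X[:, center].mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(Xc, full_matrices=False)
    return s[0] * vt[0]

full  = pc1(X, slice(None))          # conventional full-period centring
short = pc1(X, slice(-100, None))    # short ("Mannian") centring

print(abs(hs_index(short)), abs(hs_index(full)))
```

    On trendless red noise, the short-centred PC1 typically comes out far more hockey-stick shaped than the conventionally centred one; the gap between the two indices is a crude measure of what the centring choice alone contributes.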

  59. jae
    Posted Mar 28, 2006 at 3:21 PM | Permalink

    I can’t imagine that he was so arrogant as to have assumed that everyone would just accept such odd applications of statistics. Why then does he not have a better defense for what he did?

    But for Steve, the error may not have been noticed for a long time, maybe forever. It “worked” for 8 years!

  60. TCO
    Posted Mar 28, 2006 at 3:27 PM | Permalink

    I think you have to be clear what you mean when you say “the” error. The most prominent, notable methodology issue has to do with the “off-centeredness”, no? Is it possible that this could have just been a coding/logic mistake?

  61. Steve McIntyre
    Posted Mar 28, 2006 at 3:34 PM | Permalink

    #60. The off-centeredness could have been a coding/logic error. That’s a possibility that we raise in our articles. I’ve always been amused by the difference between an academic and a business response to an error. A businessman would rather look stupid than admit that he made a misrepresentation – think of all the smart businessmen who can’t seem to remember their birthdays on the witness stand. It seems that academics would rather say that they misrepresented things than admit that they made an error, i.e. Mann would rather say that he misdescribed his methodology than that he didn’t know what he was doing.

  62. Neil Fisher
    Posted Mar 28, 2006 at 3:46 PM | Permalink

    Re:#17. It’s all very well to suggest that we should all take the time to smell the roses, use public transport and so on. And if you happen to be lucky enough to live in the industrialised world, it’s certainly do-able. But all the people who argue for this seem to miss the point that, by almost any measure you care to use, quality of life pretty much boils down to how much energy you consume. By not using your air conditioner as much, you are saving some small amount of energy, yet that fairly small amount of energy (from your perspective) would be a *considerable* amount of energy for the roughly half of the world’s population that manages to survive on less than US$2 per day. In that light, perhaps it would be better to spend your efforts on feeding, clothing and housing these people, or, from an environmental POV, fixing up the mess created by multi-national companies that supply you (and me) with “cheap” goods produced in countries where environmental concerns rank lower for most than simply putting food on the table and a roof over their heads.

  63. kim
    Posted Mar 28, 2006 at 3:52 PM | Permalink

    Quality of life is traditionally, but not necessarily, connected with consumption of energy. Solar alone could theoretically support quadrillions; how do we measure the quality of that life against that of some smaller number?
    =========================================

  64. Steve Sadlov
    Posted Mar 28, 2006 at 4:04 PM | Permalink

    Dano, the truth in and of itself is the goal. This is philosophical. Let the truth emerge!

  65. Steve McIntyre
    Posted Mar 28, 2006 at 4:18 PM | Permalink

    #58. Actually, now that I think of it, there’s some amusing flipping in the regression stage. Mann has 11 long instrumental temperature series as “proxies”. Under the overall Mannomatic, about half of the series have negative weights – so an increase in the 18th century at, say, Trondheim (from memory) results in a lowering of the 18th century NH index.

    The difficulty with some of the questions is that I think that you’re assuming that there is a “right” method of extracting a temperature history from a pig’s breakfast of wonky data, with no effort to exclude precipitation series. I can probably guess the impact of different procedural steps relative to taking the mean of all the series, or relative to taking some kind of geographically weighted average (I don’t propose to do this at present), and I can probably guess pretty accurately.

    Just guessing for now – I’ve looked at it in the past, but don’t have the numbers handy; I’ll check this at some time. Let’s say that the benchmark would be geographic weighting. Let’s first think about the North American tree ring network. Let’s hypothesize that these are all decent proxies with a bit of noise in them. Under geographical weighting, the bristlecones and Gaspe would on a combined basis represent no more than 3-4% of the area, and the North American tree ring network is about 25% of the NH area. So these two proxy series – the bristlecones collectively plus Gaspe – should account for about 1% of the total weight in the 15th century step.

    My guess is that the overall Mannomatic gives them about 30-50% of the total weight, i.e. an inflation of about 30-50 times their geographic weight. That’s why they imprint the final series so much. The blade inflation argument is relevant (and not a change of topic) because you can have 100% of a blade effect contributed by series with only about 20% of the weight.

    Could have done flat lines; noise seemed like a better approximation. But it depends on the audience. The flat lines aren’t a bad idea for a general presentation.

    Can’t deal with all your points right now.

  66. TCO
    Posted Mar 28, 2006 at 4:42 PM | Permalink

    Ross,

    1. I put some effort into my questions and would prefer answers to the particulars. (Just as you would expect with Mann.) You say that your opponents won’t engage you on specifics, but here you have a proponent engaging you, and it seems that you don’t want to address specifics. Humor me and go through it point by point. If you think I’m missing an important tangential issue, fine, point it out. But at least address the first point (if you really want to dig into things – care about truth, etc.). I mean, this beats debating Tim Lambert on his tendentious points, no?

    2.

    The order of PCs matters for interpreting them. I find it easier to get intuition algebraically rather than geometrically. Take an nxk matrix X. Suppose you want to approximate all k columns of X using a single vector q and a set of scalars a(1), … ,a(k). Your approximation to X will be [a(1)*q, a(2)*q, … a(k)*q]. This will leave you with an nxk matrix of residuals, call it E. If you choose q to minimize the trace of E’E, subject to the norm of [a]=1, then q will be the first PC of X. Now do the same operation on E, and the first PC of E is the second PC of X. Your approximation to E leaves a matrix of residuals F, and the first PC of F is the 3rd PC of X, and so forth. That’s why it matters if the hockey stick is in PC1 or PC4. PC4 is the stuff left over after the top 3 dominant patterns are accounted for, in an exercise ultimately aimed at identifying the one dominant pattern. In MBH99, they talked about the NOAMER PC1 as being essential to the overall results. If they had been talking about the NOAMER PC4 being essential, they’d have been laughed off the page. If your results based on PC1–PC3 give you one conclusion, and then everything is reversed by adding in PC4, that doesn’t establish that PC4 really is the dominant pattern, since that contradicts the underlying definition of the PCs. What it shows is that you don’t have a very robust method.

    The hockey stick itself is not merely a PC, it is a weighted average of the underlying proxies. The weights are determined by the PC process that groups proxies, and the regression-like operation that maps them to temperature data. The Mannomatic PC operation takes the bristlecones in the NOAMER network and boosts the weight on them, piling them into the PC1, thereby identifying them as the dominant pattern of variance. By inflating the eigenvalue associated with the hockey stick PC it suggests that they represent a strong, dominant signal in the data.

    So is your concern that the bristlecones end up in the “wrong” PC or that the overall reconstruction (the graph) has a disproportionate amount of bristlecone? As to the latter, how much HS index would the ‘cones give with simple averaging and how much do they end up giving when the Mannomatic is used? I’m asking specifically in reference to the HS graph itself. (not PC1).

    3.

    The flipping issue arises because the PC algorithm will add a minus sign to one of the a(.) coefficients if doing so reduces the trace of E’E, even if the physical interpretation of that makes no sense. If half the proxies trend up and half trend down, and the physical argument is they should all trend up, then common sense tells you they’re probably not well-behaved proxies. Take a PC1 from that group and the PC1 will trend up (or down) simply because it puts – and + signs in as needed to line them up. But that only masks the fact that the proxies are not well-behaved.
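    The sign convention described above is easy to reproduce. A sketch with synthetic series (arbitrary seed): half the “proxies” trend up and half trend down, yet PCA lines them all up in PC1 by flipping signs:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 200)
up   =  t[:, None] + 0.1 * rng.standard_normal((200, 5))  # 5 upward-trending "proxies"
down = -t[:, None] + 0.1 * rng.standard_normal((200, 5))  # 5 downward-trending "proxies"
X = np.hstack([up, down])
Xc = X - X.mean(axis=0)  # conventional full-period centering

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
loadings = Vt[0]  # PC1 coefficients a(1)..a(10)

# The two halves of the network get opposite-signed loadings...
same_sign_within = (np.sign(loadings[:5]) == np.sign(loadings[0])).all() and \
                   (np.sign(loadings[5:]) == np.sign(loadings[5])).all()
opposite_between = np.sign(loadings[0]) != np.sign(loadings[5])
# ...so PC1 itself carries a strong trend even though the network disagrees on direction.
trend_corr = abs(np.corrcoef(U[:, 0], t)[0, 1])
print(same_sign_within and opposite_between and trend_corr > 0.9)
```

    The algorithm is happy; whether a network that disagrees with itself about the direction of the response is a set of well-behaved proxies is the physical question it cannot answer.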

    I get that. My question was to what extent it changes the results. If it’s an effect, an error, something that changes the answer from what it should be, we should be able to quantify it.

    4.

    27 C3… as Steve says, we showed in our NAS talk what straight averaging looks like. The MBH method groups the data and then maps to temperature. There is an infinite number of ways to do it. You can get a HS without PCs, as long as you let the bristlecones dominate at the mapping step. But then you need to justify letting the bristlecones run the results. That’s one sin in the Mannomatic PC method–it gives a false justification for putting dominant weight on the bristlecones.

    I say again, how much would the HS (final graph) differ in “HS index” with the mannomatic versus a simple average?

    5. The red noise argument is necessary for showing why the RE benchmark is not zero when using Mann’s PC method on autocorrelated data; it also suggests that you could get a completely artificial hockey stick on tree ring data that has long-memory structures, but in our GRL article we don’t press that point, instead we show that the particular effect in MBH98 was to select the bristlecones for dominant weighting in PC1.

    IOW, “yes TCO, red noise is not needed to show the effect in concern (mining) but is related to a different criticism (handling autocorrelated data vice iid)”.
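    The red-noise effect itself can be simulated. A simplified sketch (it mimics only the decentring, not the full MBH normalization; the AR coefficient, series count, and lengths are arbitrary choices of mine): generate red-noise networks, compute PC1 with full centering and with calibration-period-only centering, and compare average hockey-stick indexes (calibration-period mean minus full mean, in standard deviations):

```python
import numpy as np

rng = np.random.default_rng(2)

def ar1(n, phi, rng):
    """One red-noise (AR1) series of length n."""
    x = np.zeros(n)
    e = rng.standard_normal(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + e[i]
    return x

def hs_index(series, m):
    """(mean over last m points - full mean) / full std."""
    return (series[-m:].mean() - series.mean()) / series.std()

def pc1(X):
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, 0] * s[0]

n, k, m, reps = 500, 50, 100, 20
hs_cent, hs_dec = [], []
for _ in range(reps):
    X = np.column_stack([ar1(n, 0.9, rng) for _ in range(k)])
    hs_cent.append(abs(hs_index(pc1(X - X.mean(axis=0)), m)))      # conventional centering
    hs_dec.append(abs(hs_index(pc1(X - X[-m:].mean(axis=0)), m)))  # decentred: calibration mean only
print(np.mean(hs_dec) > np.mean(hs_cent))
```

    Under decentring, series whose calibration-period mean wanders away from their long-run level (which red noise does and white noise mostly doesn’t) get inflated variance and dominate PC1. That is the mining effect, and it is why the red-noise modelling matters for the RE benchmark question.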

  67. TCO
    Posted Mar 28, 2006 at 5:23 PM | Permalink

    #58. Actually now that I think of it, there’s some amusing flipping in the regression stage. Mann has 11 long instrumental temperature series as “proxies”. Under the overall Mannomatic, about half of the series have negative weights – so an increase in the 18th century at, say, Trondheim (by memory) results in a lowering of the 18th century NH index.

    I agree that this is a dramatic example of the process flipping something in the network in the non-physical direction. Unless Mann says that certain places would get colder with a macro-change in climate (his teleconnection argument about global proxies versus local ones). It is interesting to wonder if the process could do things to correct for being on the wrong side of the U or the like. Of course if the process always flips things, then how do we differentiate unacceptable flipping from good flipping? I still think that the “laughable” issue is more related to a general comment on “what sloppy guys these guys are” or to “be careful of using this method with any other problems”. However, if you don’t quantify the extent, or if you do and find that the impact is minimal (on the hockey stick index of the final graph), then it is reasonable for Mann or von Storch or ME! to say that the error does not appreciably change the result. (Note that this does not VALIDATE the result…there could be other things wrong with it…but they would be OTHER THINGS…not this thing…so it would then be fair to say that this error “didn’t change the answer”.)

    The difficulty with some of the questions is that I think that you’re assuming that there is a “right” method of extracting a temperature history from a pig’s breakfast of wonky data, with no effort to exclude precipitation series.

    No, I’m not. Give me some credit. I won’t make tendentious Ammanian comments about you making your own reconstruction. So please don’t say that because I want to get a better understanding and to evaluate criticisms one by one, that I’m validating the kvetches that I’m not addressing at that exact moment.

    I can only guess as to the impact of different procedural steps relative to taking the mean of all the series, or relative to taking some kind of geographically weighted average (I don’t propose to calculate this at present), but I can probably guess pretty accurately.

    I’m not trying to write your workplan, but my questions are reasonable and are the way to quantitatively address the “does it make a difference in the promotion” concern versus the “is it laughable” or “is it poor practice for future work” concerns.

    Just guessing for now – I’ve looked at it in the past, but don’t have the numbers handy; I’ll check this at some time. Let’s say that the benchmark would be geographic weighting. Let’s first think about the North American tree ring network. Let’s hypothesize that these are all decent proxies with a bit of noise in them. Under geographical weighting, the bristlecones and Gaspe would on a combined basis represent no more than 3-4% of the area of the North American tree ring network, and that network is about 25% of the NH area. So these two proxy series – the bristlecones collectively plus Gaspe – should account for about 1% of the total weight in the 15th century step.

    My guess is that the overall Mannomatic gives them about 30-50% of the total weight, i.e. an inflation of about 30-50 times their geographic weight. That’s why they imprint the final series so much.

    A. You get a much more dramatic result (in your “favor”) with geographic weighting than with simple, no? What would be the same numbers if you did simple averaging vice geographic? It seems like the bigger issue would be geographic weighting or not, no? Is geographic weighting a part of the Mannomatic (you never answered that)?
    B. (This is a segue from my dissection) I wonder how geographic weighting should be handled. Intuitively, if I have one sample on one continent and 100 on another, I would not want to give the two sub-averages equal weight, no? Or should I? If I had 100 samples on one continent and 10,000 on the other, I might want to just average the two subaverages, no? I assume that the noise has an issue with how we handle this, no? Seems like a simple stats issue. How should we deal with it?
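    On question B, under the simplest assumption (a common signal plus iid noise of equal variance everywhere; real regional signals complicate this), the sub-average from 1 sample is far noisier than the one from 100, so equal weighting of sub-averages is inefficient. A toy check:

```python
import numpy as np

rng = np.random.default_rng(3)
true_value, sigma, reps = 1.0, 1.0, 20000

err_equal, err_pooled = [], []
for _ in range(reps):
    a = true_value + sigma * rng.standard_normal(1)    # 1 sample on continent A
    b = true_value + sigma * rng.standard_normal(100)  # 100 samples on continent B
    err_equal.append(0.5 * (a.mean() + b.mean()) - true_value)  # equal sub-average weights
    err_pooled.append((a.sum() + b.sum()) / 101 - true_value)   # inverse-variance = count weights
print(np.var(err_equal) > np.var(err_pooled))
```

    With genuinely distinct regional signals the answer changes: once each region’s noise is averaged down, region weights should reflect area or the target quantity rather than sample count, which is the geographic-weighting question.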

    The blade inflation argument is relevant (and not a change of topic) because you can have 100% of a blade effect contributed by series with only about 20% of the weight.

    My bigger issue was not following your comments. Not following exactly what happens. In terms of relevance, sure variance “blowing” is a relevant criticism of the Mannomatic as a whole. It is not relevant to quantitative description of “flipping”.

    Could have done flat lines, noise seemed like a better approximation. But it depends on the audience. The flat lines aren’t a bad idea for a general presentation.

    I think the “red” noise was particularly bad, because (if your “mining concern from off-centeredness” does not depend on red noise) someone might mistakenly confuse the issue of offcenteredness-data-mining with autocorrelation concerns.

    Can’t deal with all your points right now.

    That’s ok. Just don’t listen to JerB.

  68. James Lane
    Posted Mar 28, 2006 at 5:30 PM | Permalink

    Re #28. Steve, a poster over at Lambert’s place made the following comment:

    ” I know that some people try to ask repeated questions [at RC] based on incorrect information originating from McIntyre, such as the idea that the data in the ftp://holocene.evsc.virginia.edu/pub/MBH98/TREE/ITRDB/NOAMER/BACKTO_1400-CENSORED directory only has bristlecone pine proxies removed when it actually has the entire North American tree ring data set and Queen Anne data set removed. This data made up 70% of all of the proxy data used by MBH98 prior to AD 1600 and was removed from the MBH98 dataset by McIntyre and McKitrick in their original 2003 E&E paper.”

    Could you clarify?

  69. Steve McIntyre
    Posted Mar 28, 2006 at 7:15 PM | Permalink

    I don’t mind explaining. The inability of many intelligent people to understand how goofy the MBH and other multiproxy studies are (something which seems so clear to me) means that I could be explaining things better.

    There’s no geographic weighting in the Mannomatic. The effect is more marked with geographic weighting as there are a lot of bristlecone series from a small area. But with simple averaging, you still don’t get a HS – see the NAS panel presentation.

    Update – let me clarify a little. In the MBH regression stage (but not the tree ring PC stage), there are weights but it’s impossible to say that they are geographic weights. I don’t know what they are. A&W do calculations without any weights. There are some very odd QC issues. For example, there are 4 different series from Quelccaya ice cores used. Instead of averaging two different O18 series, they are used individually; same with two accumulation series. These are all in the small MBH99 network. The Gaspe series is used twice – once in the North American tree ring network, once individually. Series from Spruce Canyon CO are used 7-8 times – the EW and LW series are used in the Stahle/SWM network; then near-duplicate versions are used (the first 100-125 values are identical); then the RW and MXD series are used in the NOAMER network; then the MXD is used in the Briffa composite and I think that the RW is used in the Fritts composite. In my grandmother’s trunk, there is a…

  70. TCO
    Posted Mar 28, 2006 at 7:30 PM | Permalink

    Steve, I guess then one of your criticisms of the Mannomatic ought to be the LACK of geographic weighting itself.

  71. mark
    Posted Mar 28, 2006 at 7:55 PM | Permalink

    That’s another interesting problem. If the weights for the bristlecones are so dominant, how can anyone make a claim to “global” representation?

    Mark

  72. Steve McIntyre
    Posted Mar 28, 2006 at 7:57 PM | Permalink

    the idea that the data in the ftp://holocene.evsc.virginia.edu/pub/MBH98/TREE/ITRDB/NOAMER/BACKTO_1400-CENSORED directory only has bristlecone pine proxies removed when it actually has the entire North American tree ring data set and Queen Anne data set removed.

    The AD1400 step has 95 potential series, of which 70 are grouped into the North American tree ring network (which has no fewer than 20 bristlecone series, some of which are only a few miles from one another.) The North American tree ring network is made into 2 PCs, the PC1 “effectively” excluding all series except bristlecones through the PC weighting.

    The CENSORED directory is Mann’s and contains PC series calculated using 50 of the 70 series, without bristlecones. It doesn’t say this and the series are not even identified, but I figured it out. The PCs in this directory do not have a hockey stick shape, showing that without the bristlecones there is no HS shape in this network. If you went forward with an MBH98-type calculation (and their multivariate method is by no means a good one or “truth”), using these PCs with the other AD1400 proxies gives high 15th century values.

    Mann’s PC method had two undisclosed aspects – the decentring oddball PC calculation and a stepwise change of components – neither of which is “conventional”. The “conventional” PC method is to do PC calculations over the maximum period for which all series are present. We asked Mann for methodological clarification prior to MM03, which he refused to give. So MM03 did PC calculations using the maximum available period. The bristlecones were not “thrown out” in this calculation, but by using a conventional PC method, the North American PC1 was not available in the 15th century network for the MM03 calculation.

    For the Gaspe (Ste Anne) series, we used the archived version of the series. Under Mann’s stepwise methodology, a series had to be present for the entire period to be used. It turned out that Mann had done an unreported extrapolation of the Gaspe series to get it into the AD1400 network – the only such extrapolation out of 415 series. We did not “remove it”. Mann had diddled it to get it into the AD1400 calculation (since it had a hockey stick shape). We simply used the archived version and let the chips fall where they may.

    No series were ever “removed”. However, in MM03, by doing a conventional PC method, the bristlecones did not enter into 15th century calculations and this caused a difference in results. Remember that Mann had said that his results were “robust” to the presence/absence of all dendroclimatic indicators, so they should have been robust to removal of 70 US tree ring series, much less the bristlecones.

    More info spilled out after MM03. Mann has sought to conflate the stepwise issue with the decentring issue (and the bristlecones). When we re-did the calculations with a stepwise method and centered calculations, we still got high 15th century values, so it wasn’t the stepwise method itself that caused the difference. After MM03, Mann made the FTP site for MBH98 publicly accessible and it turned out that there was a bit of Fortran code for the tree ring PC calculations. I parsed through that and found the decentring problem, which explained the divergent results.

    No series were ever removed. However, Mann has repeated this so many times that he probably believes it. But even if they were, it shouldn’t have mattered given his warranty of robustness. And even more, the MM05 emulations completely replicated the MBH stepwise methods. The differences have nothing to do with the 50 other North American tree ring series; they have only to do with the bristlecones. Mann knew of the lack of robustness to bristlecones, but nonetheless claimed robustness to presence/absence of all dendro indicators. You never hear them even try to explain their way out of this canard.

  73. Ross McKitrick
    Posted Mar 28, 2006 at 10:27 PM | Permalink

    #66

    1. I put some effort into my questions and would prefer answers to the particulars. (Just as you would expect with Mann.)

    Well actually I wouldn’t expect answers to the particulars from Mann…

    it seems that you don’t want to address specifics

    Bzzzt. I have written various summaries precisely to walk readers through them. You put a scatter of small postings up on this thread and I lost track of your points. Feel free to summarize remaining issues, and bear in mind that I may not respond for hours or days because of my day job.

    A So is your concern that the bristlecones end up in the “wrong” PC or B that the overall reconstruction (the graph) has a disproportionate amount of bristlecone? As to the latter, C how much HS index would the “cones give with simple averaging and D how much do they end up giving when the Mannomatic is used? I’m asking specifically in reference to the HS graph itself. (not PC1).

    A: Conventional PCA (on the cov matrix) would assign the bristlecone pines (bcp’s) to PC4. Mann’s method assigns them to the PC1. In that sense it puts them in the “wrong” PC. B If they were in the PC4 they would still influence the final reconstruction just as much as if they were in the PC1, since that stage of the method doesn’t care which PC it is. That, however, is just a consequence of the least squares method, and doesn’t establish that they should get so much weight. Statistics can’t tell you whether the weight is disproportionate or not–that has to be decided on physical grounds. Our reading of the evidence is that bcp’s are widely considered to be contaminated by non-temperature effects in the 20th century and should therefore get minimal or no weight. The fact that they get so much weight in MBH is disproportionate. C You mean, if you do a simple average with/without the bcp’s, what’s the difference? None, I think. A simple average isn’t a HS. See our NAS presentation. The simple mean of 20th C proxies slopes down. I suppose if the bcp’s are removed it would slope down a bit faster, but I doubt it would make much difference. D That’s discussed in our EE05 paper. The Mannomatic puts them in PC1 and guarantees they make the starting lineup. If they get dropped to PC4 but stay in the regression model there’s still a hockey stick–however then you have to justify a conclusion that pivots on inclusion of PC4(8%) from a single network. The issue then is robustness.
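    The point in B, that the regression-like step is indifferent to which PC a pattern lands in, is just the invariance of least squares to column order. A minimal check (random data, hypothetical sizes):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((79, 4))  # e.g. 4 retained PCs over a calibration period
y = rng.standard_normal(79)       # a temperature index to fit against

def fitted(Z):
    """Least-squares fitted values of y on the columns of Z."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return Z @ beta

perm = [3, 1, 2, 0]  # swap "PC1" and "PC4"
print(np.allclose(fitted(X), fitted(X[:, perm])))
```

    The ordering information, i.e. which pattern is dominant, has to be carried by the analyst; the regression will not do it.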

    If its an effect, an error, something that changes the answer from what it should be, we should be able to quantify it.

    It’s intrinsic to PCA so it’s not an “error” if what you want is PCA. If on physical grounds you don’t like your computer putting in minus signs willy-nilly, then you don’t want PCA. You can’t just go in and arbitrarily remove the minus signs though, because then you don’t have PCs any more. So you can’t properly quantify the influence, since it would be against a counterfactual that has no clear interpretation. In EE05 we reconcile EE03 to MBH98 in the most relevant steps, but don’t include a “non-flipping PC”, because that would be flippant, and readers would tire of us fillipping the flipping graph.

    I say again, how much would the HS (final graph) differ in “HS index” with the mannomatic versus a simple average?

    See simple average in NAS. Maybe Steve will send you the average series and you can do some regressions and tell us.

  74. Willis Eschenbach
    Posted Mar 28, 2006 at 10:59 PM | Permalink

    Re 69/72, Steve, that has to be the tightest, clearest explanation of (some of) the problems with the hockeystick that I have seen. It should be part of a (sticky) FAQ up at the top of the site …

    w.

    PS – Preview not working for me either.

  75. fFreddy
    Posted Mar 29, 2006 at 12:57 AM | Permalink

    Re #73, Ross

    If they were in the PC4 they would still influence the final reconstruction just as much as if they were in the PC1

    Ross, when you have a moment, this point could do with some clarifying. On the face of it, if there is no difference between being in the PC1 or the PC4, it seems kinda strange.

    TCO, questions that improve understanding are good questions. Go for it.

  76. James Lane
    Posted Mar 29, 2006 at 3:05 AM | Permalink

    fFreddy, it is strange, and that’s one of M&M’s points. It’s something of a judgment call as to how many PCs to retain for the next step in the analysis. Via the Mannomatic, the bcps load on PC1, explaining 38% of the variance. Obviously you’re going to retain PC1 (otherwise you wouldn’t have any data to work with). Via conventional PCA, the bristlecones load on PC4, which explains 8%.

    Mann argues that even under “conventional” PCA, the PC4 should be retained for the regression stage. There is no hard and fast rule as to how many PCs should be retained, although there are “guidelines” (scree test, eigenvalue rule, Preisendorfer). In a post hoc explanation, Mann invokes Preisendorfer to admit the PC4. But the subsequent regression stage, as Ross says, doesn’t account for the variance explained by each component.
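    For concreteness, a sketch of the kind of retention arithmetic involved (synthetic data; the eigenvalue rule of thumb shown is just one of the guidelines mentioned, and Preisendorfer’s Rule N is a Monte Carlo variant of the same idea):

```python
import numpy as np

rng = np.random.default_rng(5)
t = np.arange(200)
# 70 synthetic series sharing 3 underlying patterns, buried in unit noise
signal = np.column_stack([np.sin(t / 20), np.cos(t / 9), t / 200.0])
X = signal @ rng.standard_normal((3, 70)) + rng.standard_normal((200, 70))

Xc = X - X.mean(axis=0)
eig = np.linalg.svd(Xc, compute_uv=False) ** 2  # eigenvalues (up to a 1/n factor)
explained = eig / eig.sum()                     # share of variance by PC, for a scree plot

# Eigenvalue rule of thumb: retain PCs whose eigenvalue exceeds the average
keep = int((eig > eig.mean()).sum())
print(explained[:5].round(3), keep)
```

    Whatever rule is used, the regression stage downstream treats a retained PC4 exactly like a retained PC1, so the retention decision is where the “8% of one regional network” judgment has to be made.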

    Steve has written extensively on this here:

  77. James Lane
    Posted Mar 29, 2006 at 3:06 AM | Permalink

    Oops:

    http://www.climateaudit.org/?p=296#more-296

  78. TCO
    Posted Mar 29, 2006 at 6:32 AM | Permalink

    I think I’m getting somewhere with pinning you down, Ross.

    Complaining about which PC something lands in is a more minor complaint than a change in the stick index of the overall reconstruction. I think it’s worthwhile that I called you on it. I bet there were plenty of people who didn’t realize the subtlety in your kvetch.

    As far as which PCs get incorporated, that is a different issue and starts to become metaphysical since you all don’t really agree with the PC method itself.

    Segueing: I really don’t see why one should leave any of the information out via PCA and exclusion of lower order PCs.

  79. TCO
    Posted Mar 29, 2006 at 6:59 AM | Permalink

    It’s intrinsic to PCA so it’s not an “error” if what you want is PCA. If on physical grounds you don’t like your computer putting in minus signs willy-nilly, then you don’t want PCA. You can’t just go in and arbitrarily remove the minus signs though, because then you don’t have PCs any more. So you can’t properly quantify the influence, since it would be against a counterfactual that has no clear interpretation.

    If you are complaining about the effect then it must be because it changes the answer from what it should be. (If it doesn’t change the final answer, then just say that…we can progress then to wondering why it’s a methodological error if it has zero effect.) If it has a non-zero impact on the answer, then that impact must have a quantity.

    P.s. I guess you’re being cute, but I meant “expect” in the sense of “proper behavior deserved” versus “anticipated action”. Anyhow, I hope you hold yourself to the same standards that you demand rather than those that your opponents evidence.

    P.p.s. Thanks Fred…go tell Jer.

  80. Posted Mar 29, 2006 at 7:21 AM | Permalink

    TCO, there can be a methodological error in a technique which, given a particular set of data, has no actual effect on the outcome.

    For example, take MBH98. If you believed that tree rings, etc. were a good proxy for temperature but only in a positive sense (i.e. hotter = wider/denser rings), then you could complain that the MBH98 method will happily flip trees which have a *negative* correlation to temperature and use them as part of the signal. If you believed that there were only positive, linear responses in the proxies then that flipping would be introducing errors into the signal you’re extracting. (Whether or not it’s the case we’ll leave for another discussion.)

    So obviously given *some* sets of data this will lead to a wrong conclusion. But, given other sets of data (where the strongest correlation is with proxies that have a positive response as expected), it will make no difference. That does not mean it is not a flaw with the method. It just means that with that particular set of data, the flaw does not affect the outcome.

    Frankly I’m having a hard time understanding your tone here and the type of accusations you are making. It seems to me all this has been made clear in the past at this blog. The arguments are somewhat technical in nature (some posts I don’t understand at all!) so it’s reasonable that you missed some of them or didn’t understand them. But I don’t think you should blame your misunderstandings on Steve or Ross. They are correct in pointing to flaws in the method *even if they do not affect the results of the published studies* because if one were to use the same method on different sets of data those flaws could in fact affect the result. Therefore it is a flawed method, regardless of whether it accidentally comes up with the correct result some of the time.

  81. James Lane
    Posted Mar 29, 2006 at 7:32 AM | Permalink

    TCO, love you to bits, but you have the wrong end of the stick on the “flipping” issue.

    First, I think Steve and/or Ross have already explained that it’s not really germane to their criticisms of MBH.

    Second, “flipping” is a basic feature of PCA. It’s not something diabolical, it’s everyday stuff in applied social psychology.

    Consider a “personality assessment” questionnaire with the following agree-disagree statements (among many others):

    1. I enjoy meeting new people.
    2. I am uncomfortable in social situations.

    Using PCA, it is not surprising to find that these statements load on the same principal component, albeit one with a positive sign and one with a negative sign. But we can agree that they both contribute to some underlying dimension that might be described as (and I’m being hypothetical) “Sociability”.

    PCA doesn’t care if the correlation (or covariance) is positive or negative, it only cares about the strength of the relationship.

    That’s when PCA becomes problematic with something like alleged temperature proxies. One series can go up, the other down – PCA doesn’t care, it will load them on the same PC, and unless you look at the actual loadings, you would never know.

    As I said, I don’t think this point is particularly important to M&M’s arguments (they might disagree), but that is how it works. What it does suggest is that PCA is not really a good way to deal with a host of divergent proxy series.

  82. TCO
    Posted Mar 29, 2006 at 8:18 AM | Permalink

    Nicolas: I agree that there might be methodological flaws which don’t evidence themselves under certain data. And I think it is useful to point out such errors (to prevent the method from being used in the future, to show what poor thinkers the method creators were, etc.).

    But, I also think it is value-added to clarify which flaws affect the “hockey stick graph” materially. If someone gets the mistaken impression that a general method criticism “changes the story” when it doesn’t, then that’s a bad inference.

    Ross: If you are pointing out a flaw in the method (improper flipping) then surely this can be quantified for a given case (or for hypothetical cases, if you are more interested in criticizing the method as a method).

  83. Ross McKitrick
    Posted Mar 29, 2006 at 9:43 AM | Permalink

    TCO,

    I think I’m getting somewhere with pinning you down, Ross.

    Re-read E&E05, Section 3. From my perspective, I’m just elaborating on things stated in print last year. Something like this takes a lot of explaining and re-explaining to iron out all the nuances.

    Complaining which PC something is a more minor complaint than a change in the overall stick index in the overall reconstruction (stick index).

    Sorry I didn’t follow that sentence.

    Segueing: I really don’t see why one should leave any of the information out via PCA and exclusion of lower order PCs.

    The whole point of PCA is to reduce the dimensions of a data matrix, ie ‘leave information out’. If you have 400 columns of data and you use all 400 PCs you’re no farther ahead. You have to decide how many PCs to drop. James explains this well in 76/77. In terms of MBH98, Mann never had to make a case for the bcp’s because they seemed to be the dominant pattern and appeared in PC1. If we concede for the sake of argument that PC1-PC5 ought to be included, and it had been known that the bcp’s show up in PC4, the natural question readers would ask would be, does inclusion of PC4 affect the results? Since the answer to that is, yes–it reverses the results, he would immediately have been challenged that one low-order PC in one network, representing at most a regional signal and, what’s more, drawn from contaminated data, cannot be the basis of your overall conclusions.

    If you are complaining about the effect than it must be because it changes the answer from what it should be.

    You’ve misunderstood how this point arose. In Steve’s posting he’s responding to New Scientist mischaracterising our argument as reducing to, in effect, ‘Mann’s method creates hockey sticks out of thin air.’ Steve’s response, in the paragraph beginning “The data mining from Mann’s PC methodology has been well-publicized, but people tend not to get the nuances right. You can get …” argues that, to paraphrase, yes you can get an artifact-HS from red noise because of the way Mann’s algorithm displaces the 20th century section and PC analysis lines up the results, but in the MBH98 context the main point is that his algorithm over-emphasized the bcp’s in the NOAMER network. If they are removed then his algorithm no longer differs from a regular PC method (see our reply to Huybers, paragraph 9). The ability to generate HS’s from red noise matters for computing the RE benchmark. We did quantify this, by showing how it moves the RE significance benchmark up to 0.51. You want quantification of a different issue, which is evidently of more interest to you than to us. We did show one example in E&E05 pp 77-78 of how the sign-flipping can lead to obviously incorrect results. Another example is that some instrumental temperature series in the MBH98 data base end up with negative weights in the final NH average. We don’t pursue these points any further, only to say that the statistical method does not guarantee physically plausible results.

  84. Ross McKitrick
    Posted Mar 29, 2006 at 9:47 AM | Permalink

    #75 – fFreddy, from the PCA perspective there is a difference between PC1 and PC4, as stated in #47. In the regression-like fitting step, they’re just columns in the matrix, and that operation doesn’t care which column is which. The researcher needs to keep the information from the PCA in mind when evaluating whether the fitting step results make sense, though.

  85. Steve McIntyre
    Posted Mar 29, 2006 at 10:00 AM | Permalink

    TCO – some of these issues depend on the context. Sometimes we get challenged on methodological points, sometimes on impact on MBH98; we don’t always get to set the agenda. We have to show everything – that there are problems with the methodology, and that they matter.

    The flipping issue came up primarily with Ritson. He didn’t believe what we were saying about PC methodology in MBH98 and challenged us to synthetically increase early 15th century tree ring proxies by 0.5. I said – OK, but I’ll increase everything but the bristlecones by 0.5 and see what happens. The Mannomatic flipped over all the non-bristlecone series and lowered the index for the early 15th century. Ritson went ballistic and threw around words like Rathergate, pretty much accusing us of fraud.

    So flipping has a context for us that may not be obvious to a reader. It wasn’t an issue that we made a big deal of; we didn’t mention this effect in GRL and only in passing in E&E. It was mentioned more as a methodological point than as a point with MBH impact, since people often ask (including yourself) whether PCA is the “right” method or what’s the “right” way to do things. Our point is that any method which does things like this can’t be the “right” method; we’ve avoided giving prescriptions as to how things should be done. Personally, I don’t believe that you can just take a jumble of tree ring series, apply an unsupervised algorithm and get a magic answer.

    As to quantifying various MBH errors, one of the things that you know by now is that there are many, many sloppy and incorrect things in MBH. Does the use of Paris precipitation data in a New England gridcell “matter”? Does the non-use of data said to be used “matter”? Most MBH errors don’t “matter”, but they evidence a total sloppiness. You don’t know in advance which errors “matter” and which ones don’t. And you have to plow through a lot of errors that don’t “matter”.

    In terms of what “matters”, the main issue is the presence/absence of bristlecones and the weighting of bristlecones. If bristlecones are heavily weighted in the MBH98 data set, then you get a hockey stick; if bristlecones are not heavily weighted, you don’t.

    Mann has developed a couple of new rationales for getting heavy weights for the bristlecones, but these new rationales are not “correct” methodology in the sense that you can go to Draper and Smith and see – aha, there’s the method. There are methodological choices made to heavily weight the bristlecones; other choices lead to different results.

    Mann justifies the heavy bristlecone weighting on the grounds that he gets a good RE statistic with these weights – this leads into the entire discussion of r2 and RE.
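    [Since the r2/RE question recurs throughout this thread, here is a minimal sketch of the two verification statistics as conventionally defined. The function names are mine; MBH98’s actual implementation involves the full calibration/verification machinery, which this does not attempt.]

```python
import numpy as np

def re_stat(obs, pred, calib_mean):
    """Reduction of Error: 1 minus the ratio of the reconstruction's
    squared error to that of a naive 'calibration mean' prediction."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    sse = ((obs - pred) ** 2).sum()
    sse_naive = ((obs - calib_mean) ** 2).sum()
    return 1.0 - sse / sse_naive

def r2_stat(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    return np.corrcoef(obs, pred)[0, 1] ** 2
```

    [An RE above zero is conventionally read as skill against the naive benchmark; the point argued in the thread is that red-noise simulations move the significance cutoff well above zero, to about 0.51, so a positive RE alone does not establish significance.]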

  86. TCO
    Posted Mar 29, 2006 at 10:03 AM | Permalink

    Ross, my comments/questions were directly related to statements made in this post/comments. Steve said:

    The data mining from Mann’s PC methodology has been well-publicized, but people tend not to get the nuances right. You can get a hockey stick shaped PC1 from series in which there is no hockey stick shape in the underlying data. His method will pick out and overweight series with 20th century trends and flip the series so that the trends are all in the same direction.

    I want a quantification of how much this changes the hockey stick in MBH98. That is a valid “nuance” to want to know. If it’s worth complaining about, it’s worth quantifying. If it has a minimal effect IN THIS CASE, then I want people to know that clearly. It still might be a lousy method that we should avoid. It still might show what a putz Mann is. But we don’t want people to get the wrong impression that it changed the answer, if it didn’t. So we need to know the extent of the impact.

  87. TCO
    Posted Mar 29, 2006 at 10:09 AM | Permalink

    The whole point of PCA is to reduce the dimensions of a data matrix, i.e. “leave information out”.

    I GET THAT, ROSS. My comment (clearly labeled as a “segue”) was that I wasn’t clear what the justification for doing this type of thing is. My comment is a general concern with PCA. It seems that if the general point is to “leave information out”, then you are doing something almost like clearing all the outliers out of a regression. I don’t know that that is justified. Feel free to engage, with that as the thought-starter…

  88. TCO
    Posted Mar 29, 2006 at 10:41 AM | Permalink

    I think that it’s important when looking at your criticisms of MBH methodology to quantify the effect of each listed “error” on the final result. I’m not stopping you from listing every error for the “embarrassment value” or from the “it could be worse in a different situation so don’t use this methodology”.

    It sounds like the major cause of the bristlecone “overweighting” in the hockey stick has to do with the lack of geographic weighting, not with the offcenter PCA error. Correct?

  89. Posted Mar 29, 2006 at 12:49 PM | Permalink

    TCO, the offcenter PCA error is part of what’s responsible for the outrageous weighting of a very small geographical area worth of proxies. However, you could argue it would be less serious (although still a problem) if proper geographic weighting were part of the method. That is, the bristlecones would probably still bias the north-western America geographic region and produce incorrect temperature reconstructions in that region, but wouldn’t affect the overall result so badly, because that’s only a small part of the planet, whereas the MBH method allows a single proxy or small set of proxies to influence the “global” result. You could still validly criticize such a method as being incorrect, though, for various other reasons.

    A similar thing could be said of the “flipping” problem. It is a problem, but the bristlecone/offcentering/weighting issues are so much worse that it’s not really worth mentioning in the context of MBH98. However, if you were to remove the bristlecones from MBH98, eliminate the splicing and remove the off-centered normalization step, then the “flipping” problem could become a relatively serious problem. However, it seems to me to be easier to simply come up with a new and better method from scratch, trying to avoid all the MBH problems, including the ones like flipping which are relatively small, but still problems.

  90. TCO
    Posted Mar 29, 2006 at 4:34 PM | Permalink

    #75 – fFreddy, from the PCA perspective there is a difference between PC1 and PC4, as stated in #47. In the regression-like fitting step, they’re just columns in the matrix, and that operation doesn’t care which column is which. The researcher needs to keep the information from the PCA in mind when evaluating whether the fitting step results make sense, though.

    So if both PCs are retained, it’s irrelevant (mathematically equivalent) which PC the bristlecones are in. In that case, I think it’s rather tricky/misleading to mention the concentration of bristlecones into the PC1 versus the graph overall. I’m glad I pinned you down.

  91. James Lane
    Posted Mar 29, 2006 at 5:25 PM | Permalink

    TCO, no.

    It IS relevant which PC the bcps are in and NOT mathematically equivalent. However the regression stage does treat the PCs as equivalent. This is why great attention needs to be paid to the PCA output (content and variance explained) PRIOR to the regression stage.

    I have undertaken PCA followed by regression literally hundreds of times (in the social sciences) and it is an unbelievably labour intensive process. If you don’t pay close attention to what the PCA is doing with your data, you can easily end up with garbage in the regression stage.

    When I first came across MBH, I was quite surprised to see PCA deployed in the “hard” sciences, because there is quite a lot of subjectivity involved in the handling of the data and interpretation of the analysis.

    Sometimes (most times?) in statistical analysis if you carefully follow a “recipe” you will end up with a reliable result. This is not the case with PCA – it requires a great deal of supervision.

  92. TCO
    Posted Mar 29, 2006 at 5:40 PM | Permalink

    If it gets preserved and has the same mathematical impact on “hockey stick index”, then it’s irrelevant if it is in the PC4 or PC1. Mathematically…it is so. If you think that PC1 is more beautiful, then you need to weigh it more heavily, need to exclude PC4, etc.

  93. TCO
    Posted Mar 29, 2006 at 7:16 PM | Permalink

    And “excluding a lower PC” is an extreme case of weighting of course.

  94. Steve McIntyre
    Posted Mar 29, 2006 at 7:16 PM | Permalink

    TCO, it’s a little unfair to say that we should have quantified how much the flipping aspect of PC series contributed to MBH as opposed to the PC method itself. It’s not a bad question but you’re taking a couple of things out of context.

    My remark about flipping in the post above was really meant in response to a very specific statement by Mann to New Scientist about properties of his PC method. The article said:

    There is one sense in which Mann accepts that this is unarguably true. The point of his original work was to compare past and present temperatures so he analysed temperatures in terms of their divergence from the 20th century mean. This approach highlights differences from that period and will thus accentuate any hockey stick shape if — but only if, he insists — it is present in the data.

    Here we’re talking about methods, not the impact on the North American tree ring network. My response is couched in those terms. Look, I was a math guy and abstract properties of systems interest me just as much as impacts on tree ring networks. My posting said that Mann’s methodological point was untrue. If you think of one of the key features of a “hockey stick” as being a series with a closing trend and a markedly reduced amplitude in the shaft, then you can get a HS PC1 without any underlying HS series. That’s what I meant:

    You can get a hockey stick shaped PC1 from series in which there is no hockey stick shape in the underlying data. His method will pick out and overweight series with 20th century trends and flip the series so that the trends are all in the same direction. In the shaft of the stick as you get away from the common 20th century feature, the noise cancels out. Since the PC1 is a weighted average of the various series, the noise features cancel out. The variance of the average is small in the shaft, but large in the blade.

    In the context of the North American tree ring network, the flipping is less material than the data mining for HS series. However, in all articles, we absolutely emphasized the data mining aspect. Other cross-analyses are possible, but my sense is that the cross-analysis is secondary.
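    [The mechanism Steve describes above can be seen in a minimal red-noise sketch. The dimensions, AR(1) parameter, and “hockey stick index” definition here are illustrative choices of mine, not the MBH98 algorithm, which also standardizes the series and runs a full calibration step. Short-centering on the final “blade” segment tends to yield a PC1 whose blade is offset from its shaft even though no individual input series has that shape.]

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1(n, rho=0.9):
    """Trendless red noise: x[t] = rho * x[t-1] + white noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.standard_normal()
    return x

n_years, n_series, blade = 581, 70, 79   # roughly MBH98-like dimensions

def pc1(data, short_center):
    # "Short centering": subtract each series' mean over the final
    # blade segment only, instead of its full-length mean.
    mu = data[-blade:].mean(axis=0) if short_center else data.mean(axis=0)
    xc = data - mu
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return xc @ vt[0]

def hs_index(series):
    """Blade-minus-shaft offset, in units of the shaft's own spread."""
    shaft, bl = series[:-blade], series[-blade:]
    return abs(bl.mean() - shaft.mean()) / shaft.std()

X = np.column_stack([ar1(n_years) for _ in range(n_series)])
print(hs_index(pc1(X, True)), hs_index(pc1(X, False)))
```

    [On average across realizations, the short-centered PC1 scores a markedly higher index than the fully-centered one, which is the “hockey sticks from red noise” point.]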

  95. Steve McIntyre
    Posted Mar 29, 2006 at 7:23 PM | Permalink

    #93. TCO, again, recall the context. After MM03, Mann (and ourselves) were trying to figure out why the MM03 results were so different. He said that the MBH98 PC1 was the “dominant component of variance” (with a huge eigenvalue), so we wanted to figure out what was so special about the MBH98 PC1. The surprise was that, using a proper PC method, the HS shape of the bristlecones was in a lower order PC. In MBH98, he had only used 2 PCs in this period-network and that led to a different result (MM05b). We carefully reported on various permutations and combinations, noting that if you did the same calculation using centered PCs and 4 or more PCs, you got an MBH98-type result.

    The issue then seemed to us, not so much how many PCs to retain, but that we’d isolated the “active ingredient” in the HS, which was the bristlecones. Then the issue was the validity of bristlecones. It all seems like a pretty logical approach to me. Mann has attempted to defend on the basis that 5 PCs are the “right” number, but there’s nothing magic about this. Look at the posts from last February on this – from the MBH98 category.

  96. TCO
    Posted Mar 29, 2006 at 7:44 PM | Permalink

    What matters is the overall HS, not the PC1. The HS that was shown was not a PC1. The graph shown was not a PC1. Even if you only keep 2 PCs (and then you have to fight/justify that…and you’re fighting in a direction you don’t believe in), then you should discuss the HS from 2 PCs, not the PC1. The graph that the public saw was not the PC1. For instance what if the PC2 fixes the concerns you have about PC1? Not saying it does, but still, PC1 is misdirection.

    I don’t give a damn about the “context” of what you were talking about. When I ask a specific question, it’s a specific question. Answer that first. Then blather on about what you think I’m going to do in misinterpreting your response (if you must). But don’t lead with the rebuttal.

  97. Bob K
    Posted Mar 29, 2006 at 10:18 PM | Permalink

    RE: 96
    TCO,
    Who died and left you in charge?

    From the tone of your post, you seem to be of the opinion that Steve has to immediately answer your every demand. Since you’ve been away from the site for some time, maybe you should review what you’ve missed. It seems rather overbearing of you to be telling Steve that he has to quickly answer your precise questions and in precisely the order you wish.

    Don’t you think Steve has enough going on, that catering to your every whim amounts to a severe imposition on his time?

    Maybe you’ll never understand. What a waste of his time that would be! If it’s that important to you, why don’t you take a course?

    I’m of the opinion that Steve has been more than generous with his time and explanations. Especially since he’s not getting paid for it.

  98. TCO
    Posted Mar 30, 2006 at 6:45 AM | Permalink

    My questions are reasonable ones. If we are going to discuss this sort of thing, I’m going to ask these questions. If someone says, look what a bunch of goobers the Mannites are, they made mistake X, then I will ask the impact of mistake X. If we don’t know (or it is small) then that puts the HS index less in doubt (at least from that cause). It means I can concentrate on the other causes/issues which do sway the answer. And it’s illustrative that others know this as well. What if you thought flipping changed the answer when it doesn’t? You would be mistaken. And you wouldn’t be concentrating on the major issue.

  99. Bob K
    Posted Mar 30, 2006 at 10:51 AM | Permalink

    RE: 98

    I have nothing against asking questions. I wouldn’t even have commented if you hadn’t included the following as a parting shot in your post 96.

    I don’t give a damn about the “context” of what you were talking about. When I ask a specific question, it’s a specific question. Answer that first. Then blather on about what you think I’m going to do in misinterpreting your response (if you must). But don’t lead with the rebuttal.

    In my opinion, that paragraph exudes an overbearing attitude and unwarranted indignation. Asking questions is fine. Just drop the attitude. I and I’m sure many others, would consider such remarks tactlessly rude, especially considering you are the one asking for help.

  100. TCO
    Posted Mar 30, 2006 at 11:09 AM | Permalink

    Ok.

  101. Martin Ringo
    Posted Mar 30, 2006 at 11:42 AM | Permalink

    Re # 86 and 98
    TCO, let me ask a question about your question. Is it correct that what you want to know is how much the final reconstruction is affected by the series that were flipped when the principal components were taken? That is, if we were using the MM hockey stick measure, you would like to know how much that measure changes if there were no flipping?

  102. Ross McKitrick
    Posted Mar 30, 2006 at 11:45 AM | Permalink

    TCO, it is annoying that you keep harrumphing about “pinning me down” when what I am doing is re-stating for your benefit points already made in our papers. You seem to think I’m being evasive, yet you ask questions that show you haven’t read (or haven’t understood) what we’ve already written.

    It does matter which PC the bcp’s are in, if you depend on the fact that the “signal” is in PC1, not a lower PC. In MBH99 they talk about the ITRDB PC#1 at great length, including: “only one of these series–PC#1 of the ITRDB data–exhibits a significant correlation with the time history of the dominant temperature pattern of the 1902-1980 calibration period. Positive calibration/verification scores for the NH series cannot be obtained if this indicator is removed from the network of 12 (in contrast with post-AD 1400 reconstructions for which a variety of indicators are available which correlate against the instrumental record.) Though, as discussed earlier, ITRDB#1 represents a vital region for resolving hemispheric temperature trends, the assumption that this relationship holds up over time nonetheless demands circumspection.” (p. 761)

    Now, if they had written all that but were referring to PC#4, the referees would surely have slammed on the brakes and asked them how PC#4 can “resolve hemispheric temperature trends” while PCs 1-3 from the same data do not, and when the PC4 loadings are all on a small group of trees from one region.

    You are interested in the effects on the HS of the main data/methodological issues we’ve raised. This was shown in our E&E paper. You should read it some time. Our GRL paper is about PCs, hence the title. The elimination of the HS shape (and any claims of skill) upon removal of the bcp’s is not a point anyone disputes: see http://www.climateaudit.org/?p=205.

    You are wrong about the importance of offcentering. In the regression model the bcp’s act as outliers that change what would otherwise be the results. See E&E05, pp 79-80, Figure 3, which quantifies the impact for you. The fact that the bcps appear in PC1 rather than PC4 is what justifies allowing them to play such an influential role. The bcp overweighting in PC1 is caused by the offcentering.

    As for flipping, you keep bringing it up, so don’t make it sound like it’s our hobby horse. Your rant in #86 shows you didn’t read my reply in #83. The flipping issue arises for explaining how PCA can make a hockey stick PC1 out of red noise, which matters for the RE benchmarking question, not for diagnosing the HS shape of MBH98. You keep demanding a diagnosis of how it affects the HS shape. I doubt it has much of an effect, but I don’t know. If you’re so curious, you do it.

  103. TCO
    Posted Mar 30, 2006 at 12:22 PM | Permalink

    Martin:

    Yes. I would like to know how the final graph would differ (for simplicity the “hockey stick index”) if no flipping had occurred.

    Ross: I don’t think that you ever gave me a response to 27.c3. And Steve’s answer was not directly responsive either. Also, I DO THINK that you are pulling a bit of a fast one with the comments on over-representation in the PC1 (versus over-representation in the overall graph). Because lots of people will not realize that PC1 is just an intermediate result. And that differing weights in PC1 is a different issue than differing weights in the actual “hockey stick graph”. It’s also quite a neat trick in allowing you to “bypass” the Preisendorfer’s “n” kerfuffle. So, I don’t back down on this. You haven’t given me any reason to do so.

  104. TCO
    Posted Mar 30, 2006 at 12:29 PM | Permalink

    martin: “yes”

    ross: I mean 27c1.

  105. Posted Mar 30, 2006 at 12:57 PM | Permalink

    TCO, I don’t understand what your question “WHAT GRAPH IS IT IN?” means. Perhaps the others are similarly confused.

    You seem to be asking, in that question, what the difference in the meaning between the various PCs (PC1, PC2, PC3, PC4) is. I think that’s already been explained. However, since your question is confusing, it’s hard to know. Perhaps you ought to clarify what you are asking for?

    I’m pretty sure the answer is that it’s pretty arbitrary. That is, the PC1 always explains more of the signal than the PC2, the PC2 always explains more of the signal than the PC3, etc. But as others have pointed out, where you stop is somewhat arbitrary and subjective, and how much difference there is between the PCs varies with your data. I personally think that if PC1, PC2 and PC3 do not have a hockey stick and PC4 does, it’s disingenuous to claim that your method has shown that the data is hockey-stick shaped, since there are clearly three stronger signals than that. However, it’s complex, and there may be reasons to attribute meaningfulness to a PC4. Ross, Steve and Martin all know much more about this than I do. I think if you phrase your question more clearly you’re more likely to get an answer.
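    [The ordering point above is true by construction: the PCs come out sorted by variance explained, whatever the data. A short check, on illustrative random data of my own choosing:]

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 6))
X -= X.mean(axis=0)                        # ordinary (full-mean) centering

s = np.linalg.svd(X, compute_uv=False)     # singular values, sorted descending
explained = s ** 2 / (s ** 2).sum()        # fraction of variance per PC
print(explained)                           # non-increasing: PC1 >= PC2 >= ...
```

    [The subjective part is only where to cut the list off, which is exactly the Preisendorfer retention dispute discussed in the thread.]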

    I don’t think the final graph will differ much if no flipping had occurred AS LONG AS THE BRISTLECONES ARE STILL INVOLVED in the “Mannomatic” process. This is because they are positive hockey sticks and require no flipping, and they get massive weight. I suspect if you remove the bristlecones, then flipping starts having a much bigger effect on the outcome, and I also suspect that with the red noise examples, the hockey stick index goes down heavily if you don’t flip. However since flipping is part and parcel of PCs I’m not sure it’s easy to demonstrate this. Again, others know much more than I do…

    Please calm down. Take a pill or something. Be patient and answers will come.

  106. Steve McIntyre
    Posted Mar 30, 2006 at 1:17 PM | Permalink

    #105. TCO, we talk about these issues in our E&E article. We have never “sidestepped” Preisendorfer. It’s discussed both in our articles and at considerable length in earlier threads on this blog, which you should be able to locate. It’s not incorrect to address these issues, but I’ve written a lot about them in the past in quite specific terms. If you search Preisendorfer on this site, you should be able to find the relevant materials. Look at each of the Errors Matter posts in February for example (they are listed in the MBH98 category.) There’s a lot of material.

    If you consult the abstract of our articles, you’ll see the points that we were emphasizing. We were showing the biased MBH method, the non-robustness of the MBH98 reconstruction to slight permutations of methodology and the lack of statistical significance of the MBH98 reconstruction. The allocation of impact to sub-aspects of the MBH method (e.g. flipping versus mining) is not uninteresting. There’s certainly more that could be done. However, there’s only so much that I can do at any one time. This particular issue did not come up in any of our replies – that’s not to say that it’s an uninteresting question, but I still say that it’s secondary.

    As I’ve said before, the main effect of MBH methodology is to overweight the bristlecones; where bristlecones are not involved, the method makes little difference to the result.

    One can object to this overweighting on two separate grounds – geographical overweighting (a point which we’ve made against MBH proposals to use all 15th century proxies without any regional aggregation or averaging) or on the grounds of being a questioned temperature proxy. But overarching both arguments is the issue of MBH misrepresentation of non-robustness.

    Look, I don’t mind your questions and I’m mulling over a couple of issues, but I can’t go through them one by one. Again there’s a lot in our articles and prior posts that’s relevant and should answer many of the points.

  107. TCO
    Posted Mar 30, 2006 at 1:28 PM | Permalink

    I know you’ve written about it in the past. How do you think I heard about it? I think that if I ask how much of the hockey stick graph’s HS index comes from bristlecone overweighting and am given an answer that applies to PC1 and not the HS itself, that that specific answer is non-responsive and could even be a little bit “slick”.

  108. TCO
    Posted Mar 30, 2006 at 1:30 PM | Permalink

    It’s goddamn well interesting. I bet I could give you guys a lot more useful* grilling than A&W. 😉

    *for both yourselves and for the general public.

  109. Posted Mar 30, 2006 at 1:34 PM | Permalink

    TCO, given that if you remove the bristlecones the hockey stick goes away, wouldn’t that make them 100% responsible for the hockey-stickedness?

    There are still hockey-stick-like signals in there somewhere, because it mines for them, but the final output is not a hockey stick (as I remember it). Therefore it’s night and day. Can it really get any clearer than that?

    The only person being slick here, as in hard-to-pin-down, is you I think. Sorry but that’s the way it seems. Every time you take a jab at Steve or Ross I just can’t see where you’re coming from.

  110. TCO
    Posted Mar 30, 2006 at 1:45 PM | Permalink

    Nic:

    1. There’s probably a lot you don’t understand about me. Hang in there on the technical questions and don’t worry about my motivation. It shouldn’t matter if I admire Steve (I do) or if I’m Michael Mann in a sock puppet (I…err…can neither confirm nor deny). If it’s a relevant point or question, it’s a relevant point or question. If it’s irrelevant, then it’s irrelevant (but see point 2 below).

    2. If the only issue is whether the bristlecones are good proxies or not, then that is what we should engage on. However if one is going to complain about issues like off-centering and flipping and geo-weighting, then I’m justified in examining/probing/defining them. And if you’re going to run from a discussion on them to the defensive position of CO2 fertilization, then I can’t take the original kvetches on the methodology seriously.

    3. Step back a second. Steve’s criticisms are much stronger if he can say that the HS is no good BOTH because of the overweighting AND because of the bad proxies. But if one needs to shift from an overweighting examination to a “bad proxy” discussion, then one doesn’t have independent criticisms. In essence (if the overweighting were minor) then the main problem would be JUST the validity of the bristlecones. If the overweighting is NOT minor, then one has a valid criticism EVEN IF the BCP “bad proxy” concerns don’t hold up.

  111. Martin Ringo
    Posted Mar 30, 2006 at 2:42 PM | Permalink

    TCO,
    With respect to your question, I both 1) do not know the answer and 2) do know some problems in trying to compute it.

    If we look at the weightings matrix (in PC = Data * Weightings, recalling the PCs are just linear combinations of the data), we can see which series are “flipped” within each PC by looking at the signs, e.g. the series AZ082 is flipped in the first PC but only given about one tenth the weight of the CA534 series (one of the hockey stick series). The problem in quantifying the effect, even in the PC itself, comes if we change the sign. The resulting vector (QuasiPC[.,1] = Data*Weightings[.,1]) is no longer a PC in that it is no longer orthogonal to the other PCs. We could compute these quasi PCs and determine the effect, but the results could be nonsensical because the first K quasi PCs do not necessarily span the same space as the first K actual PCs. That is, the result would not be the equivalent of a partial derivative (or difference) from the sign reversal effect. You see the problem here?

    Alternatively, we could sum the relative weightings (for each tree series) over the first K PCs, and if these were negative, we could flip the series in the data matrix then proceed with the rest of the MBH98 computation to compute the results. But I don’t know if that really would give you the answer your question asks, because again by redoing the PCA the analysis introduces another variable (or variables) into the total effect, which is then not necessarily the partial with respect to the series flipping.

    Because of those problems, I am sympathetic to Steve and Ross’s, shall I say, inability to give you a quantification of the total flipping in the final reconstruction. But I am sympathetic to your question. Hockey stick series and flipped series appear in all the PCs used in the calibration. That calibration then determines the weights of those PCs, and hence the underlying series, in the reconstruction. I think Steve’s (Ross’s?) trick of adding an amount to a particular series (indicating warming) and watching for the effect of that flip (net of other reversals of sign) is a pretty clever way of finding what amounts to a conditional derivative (holding other data points constant but allowing for all weightings, not just those of the particular series, to change) of a sign reversal. You can ask Steve if he iterated the process, say at 0.1, to minimize the scale effect, and then ask for the two final series on each side of the flip. But personally, I view the flipping as just part of the baggage that comes when one does PCA in general, be it with an MBH98 transformation or not.

    Sorry to be of so little help. I originally thought I might know a way to isolate the effect, but was wrong (again 🙂).

  112. Armand MacMurray
    Posted Mar 30, 2006 at 2:48 PM | Permalink

    Re:#110:
    It’s not your motivation that most are objecting to, it’s your perceived attitude.
    Since you’ve dodged this issue before, I’ll ask flat out:
    Have you read MM05 (GRL)?
    Have you read MM05 (EE)?
    Have you read MM03?

    Second, remember that errors besides those that affect the shape of the final HS squiggle-plot can also matter. For example, what are the true error bars on the temp estimates of the past? If they span the full amplitude (or more) of the hockey-stick curve, then the HS isn’t significantly different from a flat line.
    Also, as Steve and Ross point out, if the final HS squiggle-plot is not robust to relatively small changes in the method and/or data, the squiggle-plot is unlikely to be correct, *even though the shape of the squiggle-plot, as calculated, has not been changed*. My point is just that there are ways the squiggle-plot can be “wrong” without actually changing the trace on the plot.
    Finally, you seem to be focusing on the HS *exactly as generated* in MBH98. Just because a given problem with the technique and/or data doesn’t make a big difference in the specific MBH98 calculation doesn’t mean that it might not cause problems in one of the other HS papers, or in some future reconstruction. Thus, it does seem useful to point out all clear problems in a method, rather than having to wait until someone actually makes a given error before cautioning against it.
    Just some food for thought… (and I know you’re hungry 🙂 )

  113. Posted Mar 30, 2006 at 3:06 PM | Permalink

    TCO, Armand is correct, we (or at least I) do not care what your motivations are.

    The problem is, some of the things you are asking are the statistical equivalent of “Have you stopped beating your wife yet?” That is, you are asking for objective answers to subjective questions, and you are asking for quantitative answers to non-quantifiable questions.

    Please understand we are not attempting to change the subject. We’re attempting to answer your questions in such a way as to give you the understanding you seek. Unfortunately, sometimes that means we CAN’T answer your question directly because no answer to that question makes sense.

    For example. You ask, how much difference does flipping make to the result of MBH98?

    We can logically break down this question into these steps:
    * Calculate the MBH98 results as per Mann’s original calculations.
    * Calculate the MBH98 results as per Mann’s original calculations, except without flipping any series.
    * Calculate the “hockey stick index” of each result.
    * Get the difference of the indices and give it to you.

    Would you agree that’s logically equivalent to you asking how much effect the flipping has on the hockey stick index?

    And would you agree, based on what has been said here (especially what Martin Ringo said in #111), that we CAN’T perform a PC analysis without flipping?

    If so, then you understand why there is no “straight” answer to this question.

    Of course, it’s possible to spend a lot of time analysing the PC coefficients, fiddling the proxies, removing the inverted ones, and eventually come up with an educated guess about how much effect the flipping has. That guess may well answer your question. But you’ll have to understand, (a) that’s a fair bit of work and (b) it’s still not going to be an exact answer. I think Steve will probably get around to doing that analysis one day, and discuss flipping more. It would probably be possible to use PC analysis to calculate a good temperature reconstruction IF you were very careful how you used it AND IF you know a LOT about your proxies and had excellent confidence they had a linear relationship to temperature. Still, there are likely better methods. I really don’t think there are any methods where you can just throw all the proxies in and have the result magically pop out. That type of analysis makes far too many assumptions about the linearity and correlation of the data which almost certainly are not valid.

  114. jae
    Posted Mar 30, 2006 at 3:47 PM | Permalink

    I really don’t think there are any methods where you can just throw all the proxies in and have the result magically pop out. That type of analysis makes far too many assumptions about the linearity and correlation of the data which almost certainly are not valid.

    Yes! This statement defines the fatal flaw with this whole field of study. The researchers keep trying to find a magic statistical solution, while ignoring all the essential assumptions, EVEN the most fundamental one: do tree rings linearly record temperature changes, even under the best of circumstances (plenty of moisture, etc)? I still have not seen ANY kind of demonstration that this one central necessary prerequisite is met, even under the best of circumstances. The only relationships I have seen show growth varies with temperature according to an upside-down U function (wonder what the general mathematical equation is for this…) It appears that the dendroclimatologists completely ignore this central issue, and they just ASSUME that growth (or density) is linearly related to temperature, without even acknowledging that they have made this assumption. Need any more really be said to cast a real cloud over the use of tree rings as proxies? The only ray of hope I see is with studies such as Rob Wilson’s, where trees that are at the cold edges of their ecological niche are studied. The temperatures may rarely get high enough to reach the “plateau” on the inverted U. But that’s only reliable for relatively recent times, because those trees may have been subjected to much higher temperatures hundreds of years ago. It seems very possible that growth rates may have been retarded during the MWP, for example.
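
    The inverted-U point can be made concrete with a toy quadratic growth curve (the function and all numbers below are hypothetical, not from any dendro study). A linear calibration fitted on the cool, rising limb looks fine in-sample, but it reads a too-warm year as a cool one, because both sit at the same ring width:

```python
import numpy as np

# Hypothetical inverted-U growth response: quadratic with an optimum at T_OPT.
T_OPT = 15.0
def ring_width(temp):
    return 1.0 - 0.02 * (temp - T_OPT) ** 2

# Calibrate a linear width -> temperature relation on the rising limb only.
t_cal = np.linspace(5.0, 12.0, 50)
w_cal = ring_width(t_cal)
slope, intercept = np.polyfit(w_cal, t_cal, 1)

# A warm year past the optimum produces the same ring width as a cool one...
warm, cool = 22.0, 8.0
print(ring_width(warm), ring_width(cool))  # identical widths

# ...so the linear calibration "reconstructs" the warm year as cool,
# far below the actual 22 degrees.
recon_warm = slope * ring_width(warm) + intercept
```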

  115. TCO
    Posted Mar 30, 2006 at 4:05 PM | Permalink

    Martin, great response. I very much appreciate your kind tone as well as how smart you are on math. I think fundamentally that if the effect is a flaw…that it must be a flaw because it changed the answer…and if it changed the answer, it must have changed it some amount. At the end of the day, if it’s impossible to quantify, then maybe we aren’t describing the right thing when we complain.

    I agree that this is an issue of PCA in general, not of off-centered PCA so much (I think). In that case, of course, it’s very important (for clear discussion) that we specify that this is a concern of PCA in general and de-link the issue from “off-centeredness” (even if there is an interaction effect between the two flaws, it can be dealt with separately as one more problem).

    Moving to thinking about the problem. Some stray thoughts:
    a. could we list the series that are “flipped” in MBH specifically. then examine them and see which ones seem nonphysical and also assess the percent of total that are flipped?
    b. Could we somehow compare to the example of simple averaging? Look at mannomatic PCA flipping versus simple averaging and see the percent difference in the two?
    c. Hasn’t this concern with “flipping” come up in other places (sociology, econ, etc.) that use PCA? Maybe there are tools for handling/assessing/thinking about it?

    Basic questions:
    -when the PC has a series in it that is “flipped” and weighted, is that the only place where the series goes? Or does it get weighted and flipped (OR NOT FLIPPED) in other series? Can one examine the total amount of flipping and unflipping in the retained PCs? If one does the process out to include an infinite number of PCs, will you end up with the complete series (unflipped)? Does it go back to just being the data in the end?

    Thoughtful question:
    -how do we comment on what series are flipped? Steve had a very interesting example of a Trondheim instrumental series that was flipped. But surely some series like “annual snowfall” or “number of frozen days of the lake” would be EXPECTED to be inverse of ring thickness. Might have plenty of good info inside there. How do you tell what is acceptible and unacceptible flipping? What about the teleconnection arguments? (for instance someone might even go to the extent of saying that Trondheim does get colder when the overall climate field gets warmer.) I’m not arguing the warmer case here, I just want to think about how we do it right. If the method does both “wrong flipping” (Trondheim instrument) and “right flipping” (ice thickness), how do we resolve between the two? Even if we can’t, could we at least BOUND THE PROBLEM? Come up with a worst case?

  116. Armand MacMurray
    Posted Mar 30, 2006 at 4:39 PM | Permalink

    Re:#115

    I think fundamentally that if the effect is a flaw…that it must be a flaw because it changed the answer…and if it changed the answer, it must have changed it some amount.

    Don’t forget that it can also be a flaw if it doesn’t produce a “correct” answer in general, even if it doesn’t change “this” answer. As a trivial example, specifying a fixed reconstructed temp anomaly of 0.5 for the first year in the reconstruction might be “correct” for some reconstructions, but that doesn’t mean that it is not a flaw.

    If the method does both “wrong flipping” (Trondheim instrument) and “right flipping” (ice thickness), how do we resolve between the two?

    I think that’s the point where you decide that the *method* is not physically appropriate.

    -when the PC has a series in it that is “flipped” and weighted, is that the only place where the series goes? Or does it get weighted and flipped (OR NOT FLIPPED) in other series?

    I think you mean “other PCs,” not “other series,” at the end of that last sentence. If I understand correctly, each PC is composed of all the series, just with different sets of weights. So, a given series could be flipped in one PC and not flipped in another.
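
    Both points — that every series appears in every PC, possibly with a different sign in each, and that keeping all the PCs returns the original data unchanged (TCO’s “does it go back to just being the data” question) — can be checked with a toy SVD-based PCA. This is ordinary centered PCA, not Mann’s stepwise variant:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))            # 200 "years" x 5 toy proxy series
Xc = X - X.mean(axis=0)                  # conventional (full-period) centering

# SVD-based PCA: each row of Vt is one PC's loading vector, with one
# weight (possibly negative, i.e. "flipped") per series.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
loadings = Vt.T                          # shape: (5 series, 5 PCs)

# Inspect how series 0 enters each PC -- the sign can differ PC to PC.
print(np.sign(loadings[0, :]))

# Summing over ALL PCs returns the centered data exactly, so nothing is
# lost or net-"flipped" if no PCs are discarded.
reconstruction = U @ np.diag(s) @ Vt
exact = bool(np.allclose(reconstruction, Xc))
```

    The flipping issue only bites because the reconstruction keeps just a few leading PCs, so the signs and weights in those retained PCs are what matter.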

  117. TCO
    Posted Mar 30, 2006 at 5:26 PM | Permalink

    Armand:

    a. Yes. I think I’ve discussed the difference between flaws that affect this case and flaws that don’t (but may affect others). I agree both can happen. And furthermore, we should clearly differentiate which are which.

    b. Yeah…I meant PCs. I guess to assess the effect, we would want to know the percent of total “inappropriately” flipped series in all the series/retained PCs (for the moment, assume that we can specify which are flipped and which belong upside down).

    c. I’m not so sure that it is easy to specify which series are wrongly flipped and which not. (This doesn’t validate the method btw.)

  118. TCO
    Posted Mar 30, 2006 at 8:59 PM | Permalink

    Re:#110:
    It’s not your motivation that most are objecting to, it’s your perceived attitude.
    a. Since you’ve dodged this issue before, I’ll ask flat out:
    Have you read MM05 (GRL)?
    Have you read MM05 (EE)?
    Have you read MM03?

    b. Second, remember that errors besides those that affect the shape of the final HS squiggle-plot can also matter. For example, what are the true error bars on the temp estimates of the past? If they span the full amplitude (or more) of the hockey-stick curve, then the HS isn’t significantly different from a flat line.

    c. Also, as Steve and Ross point out, if the final HS squiggle-plot is not robust to relatively small changes in the method and/or data, the squiggle-plot is unlikely to be correct, *even though the shape of the squiggle-plot, as calculated, has not been changed*. My point is just that there are ways the squiggle-plot can be “wrong” without actually changing the trace on the plot.

    d. Finally, you seem to be focusing on the HS *exactly as generated* in MBH98. Just because a given problem with the technique and/or data doesn’t make a big difference in the specific MBH98 calculation doesn’t mean that it might not cause problems in one of the other HS papers, or in some future reconstruction. Thus, it does seem useful to point out all clear problems in a method, rather than having to wait until someone actually makes a given error before cautioning against it.

    a. yes, yes, yes.
    b. Fine. Then they can point that out in response (with a number and connected to a specific flaw…and not as some theoretical thing which might happen but we won’t be pinned down as to if it did.)
    c. I don’t follow this. Is it same point as (b)? Does it make sense?
    d. I get that, I got that, I already said that myself. I think it’s very important to differentiate between those types. It would be very wrong of someone if they cited a general methodological flaw (with minimal effect on the case at hand) and allowed the hoi polloi the mistaken impression that it made any goddamn difference with MBH!

  119. Armand MacMurray
    Posted Mar 31, 2006 at 12:08 AM | Permalink

    Re:#118

    It would be very wrong of someone if they cited a general methodological flaw (with minimal effect on the case at hand) and allowed the hoi polloi the mistaken impression that it made any goddamn difference with MBH!

    I think that’s where I differ with you. It seems to me that whether MBH implemented their technique correctly or not has little to do with whether or not the technique is a valid way to reconstruct past temperatures. In the event, it seems that they did not implement it “correctly,” but even if they did, the technique is arguably not valid for temperature reconstruction. If I understand you correctly, you’re asking essentially “what parts of MBH’s actual procedure result in a different answer from the correct temp reconstruction?” Since the “correct temp reconstruction” is unknown, we’re stuck with analyzing the *method* and the *statistical* properties of the result without being able to show that the result itself is correct.
    We can’t show that the result itself is correct, but we can test some properties that we would expect a correct result/method to have. One of these is robustness to relatively small changes in the data and/or the method (my point c above). If removing the bristlecones gives a totally different result, that’s not robust. If changing centering conventions for the PC analysis gives a totally different result, that doesn’t seem robust. Robustness doesn’t directly affect the MBH98 graph, but it sure has a big impact on whether the whole graph is believable or not.
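
    That robustness test can be mechanized as a leave-one-out loop: recompute the reconstruction with each series dropped and measure how far the answer moves. A sketch with toy data, using a simple average as a stand-in “method” (MBH’s actual regression machinery is far more involved); series 0 is given a bristlecone-like trend, and dropping it moves the result much more than dropping any other series:

```python
import numpy as np

def reconstruct(proxies):
    """Stand-in 'reconstruction': a simple average of the proxy columns."""
    return proxies.mean(axis=1)

rng = np.random.default_rng(2)
n_years, n_series = 400, 10
proxies = rng.normal(size=(n_years, n_series))
proxies[:, 0] += np.linspace(0.0, 3.0, n_years)   # bristlecone-like trend

full = reconstruct(proxies)
rmses = []
for k in range(n_series):
    loo = reconstruct(np.delete(proxies, k, axis=1))   # leave series k out
    rmses.append(float(np.sqrt(np.mean((loo - full) ** 2))))

most_influential = int(np.argmax(rmses))   # the trended series dominates
print(most_influential, [round(r, 3) for r in rmses])
```

    A method whose answer hinges on one series (or one small region) this way fails the robustness property, whatever the final squiggle-plot looks like.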

  120. TCO
    Posted Mar 31, 2006 at 3:48 AM | Permalink

    AM: I’m examining the criticisms one by one and probing a bit.

  121. Ross McKitrick
    Posted Mar 31, 2006 at 8:47 AM | Permalink

    TCO — #27,C.3:

    3. Mann comments that straight averaging gives the same result as MBHing.

    Not quite. He says that it’s possible to get an HS-like result without using PCA. Straight averaging yields the top panel of Figure 2 in our NAS presentation, which Steve has up here someplace and I posted here. There’s no resemblance to a HS from straight averaging.

    Then we have the kerfuffle about geographic extent. But what does a straight average (with some geographic normalization – and how should one do that) give?

    Geographic balance is intrinsic to evaluating the method. At an extreme, if 99% of the data are from my backyard and 1% is from the entire rest of the NH, it’s at best a representative sample of my backyard, not the NH. Robust geographic balancing should mean that we could drop any one small region and still get the same overall answer. The bcp’s are from one small region. If a method (PCA or otherwise) has the feature that you can’t drop them without destroying the final result, it doesn’t do a very good job of geographical balancing.

    Also, why would one expect PCA to correct for geographic extent? Unless you do a multiple regression with geography as a variable, how can PCA do anything magic?

    You don’t. PCA is just number crunching. To do PCA you have to put your data into groups, and the grouping, not the number crunching, is where geographical balancing is done. Just like the gridding process in global temperature averages. The idea in MBH98 was that the groups were chosen to achieve geographic balance. PCA is just the mechanism for crunching groups of varying size into a few averages. The grouping in MBH98 raises some odd issues, like why there’s a “group” for North America, but another “group” for US Southwest-Mexico, and why the Gaspe is its own “group”, as well as being part of the northern treeline sub-group, etc. But of those things the only one that matters in a measurable sense is the treatment of Gaspe, as you know.
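
    The backyard example can be put in code: with 99 of 100 series from one region, a flat average just reproduces the backyard, while grouping by region first and then averaging the group means recovers something closer to the overall signal. All numbers and “regions” below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
years = 500
backyard = rng.normal(loc=1.0, size=(years, 99))   # 99 series, local signal ~1.0
elsewhere = rng.normal(loc=0.0, size=(years, 1))   # 1 series, wider-area signal ~0.0

# Flat average: dominated by the backyard (mean ~0.99).
flat_average = np.hstack([backyard, elsewhere]).mean(axis=1)

# Group by region first, then average the group means with equal weight
# (the "gridding" idea; mean ~0.5, halfway between the regional signals).
group_means = [backyard.mean(axis=1), elsewhere.mean(axis=1)]
balanced = np.mean(group_means, axis=0)

print(round(float(flat_average.mean()), 2), round(float(balanced.mean()), 2))
```

    As Ross says, the balancing is done by the grouping, not by the averaging mechanism (PCA or otherwise) applied within each group.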

  122. TCO
    Posted Mar 31, 2006 at 12:21 PM | Permalink

    Ross:

    a. Thanks for the picture. What does Mann say? How does it differ in appearance and method from your simple average? Anything else you need to add regarding your simple average? I assume that it has all the series and that they are normalized (how?). How is orientation set?

    b. Leave aside the “robustness” debate point for a second, how should one handle geographically imbalanced data to make the best representation of the overall average? This “feels” like it should be a common sense stats issue (I asked the question about two continents, for instance, a while ago and never got a good answer.)

    c. How much does the “treatment of Gaspe” (I imagine giving it more geo-weight than it deserves) change the result numerically? How should they handle the grouping to be proper in geo-weighting? Can you average the series with geo-weighting without dividing the series into subgroups? I would be worried about more than Gaspe…isn’t it entirely possible that they are overweighting lots of other things in their grouping method? How does one do this right (for this aspect, leaving aside precip…blablabla)?

  123. Bob K
    Posted Mar 31, 2006 at 6:33 PM | Permalink

    TCO,
    I’m posting this because from my point of view, you are becoming a distraction. I wouldn’t doubt if others feel the same and are simply too nice to say so. Posting occasional questions is fine, but you put them out with a machine gun. It certainly disrupts my train of thought when reading the threads.

    RE: your post #122
    About ten questions for Ross in an eleven-line post. Many require extensive answers, some of which are likely answered elsewhere on the site already. Others are trivia, such as this one: “What does Mann say,…”. As if Ross is supposed to have some supernatural ability that allows him to speak for Mann and you simply can’t be bothered to check what Mann has said. Of course, to your way of thinking, Ross has nothing better to do, and your time is valuable.

    You from: More on PCs

    2. You still owe me answers/responses in the some thread, steve.

    You still owe me? You still owe me? Boy! That’s exhibiting a lot of gall. You’re owed nothing.

    You think you’re ‘owed’ a response to your every question. Telling Steve he should hunt down any questions you have asked that haven’t been answered, evidently because you’re too lazy to do it, and you think Steve has nothing better to do with his time. Really considerate of the use of Steve’s time, aren’t you?

    You from: Some Principal Component Illustrations

    1. Is it necessary TO AVOID it? Your justification seems strained. If it doesn’t matter, then why do I need to supply a reason for “not avoiding it”? Instead, you can supply a reason FOR AVOIDING it.

    Steve said this: (The singularity may not “matter” but there’s no reason not to avoid it.)
    As everyone can plainly see, you twisted the meaning of the his statement from ‘may not’ into ‘doesn’t’. Then you want to argue over your new meaning and demand he give you an answer you’ll accept. Really considerate of Steve’s time, again. That, to me, is trollish behavior.

    You act like a spoiled child who isn’t getting enough attention.

    You are ‘owed’ no answers or responses. They should only respond when they feel it is an appropriate use of their time. Accurate responses can take considerable time to write. How much time do you think they should spend on things you consider important relative to what they think is important? With you attempting to consume an inordinate amount of their time, should they devote little or no time to other posters to make up for you?

    I come here almost daily in an attempt to glean a modicum of insight on a subject in which I am not well versed. I know I’m certainly more interested in what they consider notable than in what your latest questions happen to be. It’s getting so my eyes start to glaze over when reading your posts.

    You apparently want to query them on all the ins and outs of every relationship possible. I suggest you take a course in the subject and give them a break. They’re considerate fellows and try to be reasonably accommodating. Why don’t you exhibit some consideration yourself?

  124. TCO
    Posted Apr 6, 2006 at 7:02 PM | Permalink

    Steve:

    Just saw this:

    Update – let me clarify a little. In the MBH regression stage (but not the tree ring PC stage), there are weights but it’s impossible to say that they are geographic weights. I don’t know what they are. A&W do calculations without any weights. There are some very odd QC issues. For example, there are 4 different series from Quelccaya ice cores used. Instead of averaging two different O18 series, they are used individually; same with two accumulation series. These are all in the small MBH99 network. The Gaspe series is used twice – once in the North American tree ring network, once individually. Series from Spruce Canyon CO are used 7-8 times – the EW and LW series are used in the Stahle/SWM network; then near-duplicate versions are used (the first 100-125 values are identical); then the RW and MXD series are used in the NOAMER network; then the MXD is used in the Briffa composite and I think that the RW is used in the Fritts composite. In my grandmother’s trunk, there is a…

    I think you edited it back into a post after I had replied already. I really advise against doing this. It is better to correct yourself with a post lower down. If you feel so, so scared of being misquoted by the evil guys, then put a bold comment in that there is an update below. But editing in stuff after the fact (or even the damn voiceofGod RealClimate-style responses to other people) is poor for the discussion (readers don’t see a new post) and seems heavy-handed. If you want to close a thread or ban a user, fine. But don’t play unfair within the discussion.

    Ok…on the content. Are you saying then that there WAS a geographic weighting of some sort? (and then seguing to what a miserable form of geoweighting they were)? And what about the “groups”. Do the groups provide “geo-weighting” or were they just used for regional effect mapping?

    P.s. It would be nice to see a list of the errors in MBH, organized into categories and a hierarchy in a MECE/Minto manner.
    P.p.s. “Interactions” can be listed too, but obviously they are a second-order effect. I mean, when I analyze a business I start with Revenue = price*volume. Looking at each factor allows me to think about what the cause of a revenue change was. Of course, Ross can remind us that “elasticity” does exist, so the two variables are not independent. But we can still do a hierarchical trouble-shooting analysis, with the interaction as a second-order issue to examine (and it may not be relevant). In the example above, if I’m a minor producer in a freely competitive market, my change in volume has minimal effect on price.
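
    The Revenue = price*volume decomposition, with the interaction isolated as the second-order cross term, works out exactly (all numbers below are illustrative):

```python
# First-order decomposition of a revenue change, interaction isolated.
p0, v0 = 10.0, 100.0        # baseline price and volume (illustrative)
p1, v1 = 11.0, 105.0        # new price and volume

dp, dv = p1 - p0, v1 - v0
price_effect = dp * v0      # what the price change alone contributes
volume_effect = p0 * dv     # what the volume change alone contributes
interaction = dp * dv       # second-order cross term

total = price_effect + volume_effect + interaction
assert total == p1 * v1 - p0 * v0   # decomposition is exact, not approximate
print(price_effect, volume_effect, interaction)
```

    The same hierarchy applies to troubleshooting MBH: attribute the change to each factor alone first, and treat factor interactions as a separate second-order item.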

  125. Ed Snack
    Posted Apr 6, 2006 at 11:14 PM | Permalink

    TCO, I hate jumping on the “criticize TCO” bandwagon, but sometimes you seem particularly dense. Steve’s statement you quote includes “there are weights but it’s impossible to say that they are geographic weights. I don’t know what they are” and then you go on to say “Are you saying then that there WAS a geographic weighting of some sort? (and then seguing to what a miserable form of geoweighting they were)?”

    Which part of “it’s impossible to say” don’t you understand? Is it the apostrophe in the “it’s” that’s throwing you, as “impossible to say” seems quite clear to me? “There is some weighting system applied, but it doesn’t appear to correlate to anything obvious” would seem like a reasonable translation to me.

    Ease up buddy, I think you add some value trying to extract an ordered array of charges against MBH, but some clarity in your own thought processes would surely make that a less laboured process.

    Apologies in advance, Steve, if you think me putting my oar in is unnecessary.

  126. Rod
    Posted Apr 7, 2006 at 5:52 AM | Permalink

    A letter in this week’s New Scientist from Lawrence Neal, House Energy and Commerce Committee, indicated that a panel of statisticians is looking into the Hockey Stick graph. This is in addition to the NAS panel, of which he is somewhat disparaging. Quote: “A panel of statisticians is looking into this now, and a group assembled by the US National Assembly of Sciences is also examining the theory, though evidently without much focus on statistical underpinnings.”

  127. TCO
    Posted Apr 7, 2006 at 7:32 AM | Permalink

    No sweat, Ed. I agree with your criticism of my questions as not perfectly incisive. In some cases, I can tell that we’re not getting clear, non-confounded issue analysis. But my attempts to probe are not always diligently constructed. IOW, “yes”.

    Sometimes I can be a bee-yatch, too.

    On the content: yeah, I saw where Steve called it “impossible to say”, but I was still concerned that it was a bit of a rowback from an earlier firm assertion (I think he edited an earlier post after comments had gone on), and I wondered if “hard to say what it is” is a bit of equivocation to make it look like less of a correction. Is a more accurate statement that they did do a “funky, flawed” geo-weighting? Is it completely unclear what they are doing? Or is a reasonable interpretation that the “grouping” is geo-weighting, but of a poor sort?

    There’s more to my question than just this post, because in another post, Steve printed colored graphs of the various groupings (the one where Dave D. gave me a bravo zulu). I’m wondering if those groups are the same “hard to understand weightings” which Steve mentions here. Because I think we would agree with calling those geo-groupings and (if combining them is different from combining the subordinate series without grouping) that this is geo-weighting. Essentially it looked in that post like taking a subaverage of geographic areas.

  128. TCO
    Posted Apr 7, 2006 at 7:34 AM | Permalink

    Err…”of each geo area, then combining the subaverages”.

  129. TCO
    Posted Apr 10, 2006 at 4:21 AM | Permalink

    Back to flipping. Is there any chance that the same series could be flipped in different directions at different time periods within the reconstruction because of the stepwise procedure? If that happens, well that’s just evil.

  130. Jim
    Posted Apr 13, 2006 at 8:40 PM | Permalink

    I suggest that Steve start a new string filled completely with TCO’s posts. Whenever TCO posts to a particular string, move the content to the ‘TCO’ string – but leave a notation that TCO posted a valuable contribution that can be found at the ‘TCO’ string. That way nobody can say that climateaudit is censoring its content – the content would merely be streamlined so that those who are participating in real discourse can avoid the Bristle Cone Pine series named TCO. TCO is becoming a blade unto himself. I suspect that he’s a grad student at Scripps or some other Temple of AGW who feels he can get back at those who have exposed his high priests by being a nuisance.

  131. James Lane
    Posted Apr 14, 2006 at 2:36 AM | Permalink

    No Jim, TCO is an asset. While I am uncomfortable with aspects of his style, his signal to noise ratio is pretty high.

    That said, TCO doesn’t demonstrate a very good understanding of PCA or its application in MBH. Some of his questions are difficult to answer because they don’t make sense in the first place. I think Ross was having problems with this in some of his “unresponsive” replies.

  132. John A
    Posted Apr 14, 2006 at 2:48 AM | Permalink

    I suspect that he’s a grad student at Scripps or some other Temple of AGW who feels he can get back at those who have exposed his high priests by being a nuisance.

    No he’s not. I know for whom he works so I can assure you he’s not a grad student at Scripps.

    But even if he was, his output suggests someone who is trying to get to grips with the complexities, and not someone wasting Steve or Ross’ time just for the sake of it.

  133. TCO
    Posted Jul 29, 2006 at 9:44 PM | Permalink

    PC1 with Mannian off-centering versus PC4 with covariance matrix is not a fully appropriate comparison. Which PC does the HS fall in if we do the correlation matrix?

    And how many of your viewers understood the “changing 2 variables at once” issue that is hidden inside of the language “use the covariance matrix”, before I belabored it? Also, there are arguments for either correlation or covariance matrix, so you should discuss each. When pinned down, you do not get behind the covariance matrix! So certainly describing the covariance matrix as “conventional” is a bit of snuck in spin.