"Mannian" PCA Revisited #1

60 Comments

  1. SteveSadlov
    Posted Mar 10, 2008 at 1:08 PM | Permalink

    BCP Related – over at the CA Forum I have a thread where I am sharing general ecological, meteorological and cryological observations made in areas where Red Fir, Foxtail Pines (and in the future, BCPs) grow. I just returned from Douglas County NV, Alpine County, CA, El Dorado County CA and Amador County CA with new observations. I will link the Forum thread here in a post later today.

  2. Posted Mar 10, 2008 at 1:21 PM | Permalink

    If M is the number of proxies in the network (22 in the AD1400 network), the number of retained temperature PCs is set equal to the number of d_j greater than 2/M (.0909 for the AD1400 network) – see the description of “RuleN3” in the source code.
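The counting rule quoted above can be sketched in a few lines. This is only an illustration of the threshold arithmetic, not the MBH source code: the function name and the sample d_j values are invented here, and d_j is assumed to be one score per candidate temperature PC.

```python
def count_retained_pcs(d, n_proxies):
    """Count the entries of d that exceed the 2/M threshold."""
    threshold = 2.0 / n_proxies
    return sum(1 for d_j in d if d_j > threshold)


# Hypothetical d_j values, invented purely for illustration.
example_d = [0.40, 0.20, 0.12, 0.095, 0.09, 0.05]

# AD1400 network: M = 22 proxies, so the threshold is 2/22 (about .0909).
threshold_1400 = 2.0 / 22
retained = count_retained_pcs(example_d, 22)
print(f"threshold = {threshold_1400:.4f}, retained PCs = {retained}")
```

Note that the rule only fixes how many PCs are kept; it says nothing about which ones.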

    So, this is how the number of TPCs is calculated. But how about the selection of TPCs? [1 2 3 5 6 8 11 15] at 1750 and [1 2 3 4 5 7 9 11 15] at 1760 ..

    MBH98 explanation:

    We chose the optimal group of Neofs eigenvectors, from among a larger set (for example, the first 16) of the highest-rank eigenvectors, as the group of eigenvectors which maximized the calibration explained variance.

    But this is not true, because selection of [1 2 3 4 5 7 8 9 11 12 13] at the 1820 step yields better calibration RE than the one used in MBH98 ([1 2 3 4 5 7 9 11 14 15 16]). Quite an algorithm, anyway ;)

  3. Steve McIntyre
    Posted Mar 10, 2008 at 1:29 PM | Permalink

    The selection seems to be done offline. There’s an option that permits Mann to state the PCs being used, but how is a particular selection made? It’s a mystery.

    Also, I haven’t confirmed this particular method. Note that Mannian short centering concentrates variance in the PC1 in this calculation as well. The loading of variance from a random matrix is sensitive to the autocorrelation in the red noise model as well, and this is itself a source of some controversy.

  4. Jeff A
    Posted Mar 10, 2008 at 2:22 PM | Permalink

    Most of what you posted is a blur to me, Steve, lol. Is a layman’s summary possible?

  5. Bruce
    Posted Mar 10, 2008 at 2:37 PM | Permalink

    Most of what you posted is a blur to me, Steve, lol. Is a layman’s summary possible?

    Abracadabra … alakazam … hockey stick!

  6. Posted Mar 10, 2008 at 2:37 PM | Permalink

    JeffA:
    I think, in some sense, the layman’s summary is the map with the dots on it.

    Imagine if you calculated the batting average for a baseball team of three players like this:

    Average= (10* Joe + 1* Harry + 0.1* Jane)/(10 + 1 + 0.1)

    Clearly, Joe’s average dominates your result. If you told people that you averaged over all three players, that would be deceptive. Yes, you did use Harry and Jane’s averages, but in a way where they have practically no influence on the results.

    There are some other nuances having to do with having patched in to estimate Joe, Harry and Jane’s averages, because of incomplete data, and doing a few other things. But, unless you have a very, very good reason to claim that you should weight Joe’s performance more than the others, there is a problem with this sort of weighting.
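The weighting in the batting example can be checked numerically. The three batting averages below are made up for illustration; only the weights (10, 1 and 0.1) come from the formula above.

```python
def weighted_average(values, weights):
    """Weighted mean: sum(w * v) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical batting averages, invented for illustration.
averages = {"Joe": 0.350, "Harry": 0.250, "Jane": 0.150}
# Weights taken from the formula in the comment.
weights = {"Joe": 10.0, "Harry": 1.0, "Jane": 0.1}

names = list(averages)
weighted = weighted_average([averages[n] for n in names],
                            [weights[n] for n in names])
simple = sum(averages.values()) / len(averages)
joe_share = weights["Joe"] / sum(weights.values())

print(f"weighted = {weighted:.3f}, simple = {simple:.3f}, "
      f"Joe's share of total weight = {joe_share:.1%}")
```

With these numbers Joe carries about 90% of the total weight, so the weighted result sits close to his average rather than the team's simple mean.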

  7. I"heart"heidicullen
    Posted Mar 10, 2008 at 2:48 PM | Permalink

    speaking of proxy discrepancies, and Mann, Jones, etc. i was searching for time-series glacial photos showing retreat from say 1880 to 1960 (not scary, non AGW) and came upon this IPCC 2001 chapter.

    http://www.grida.no/climate/ipcc_tar/wg1/064.htm

    fascinating! the last 2 paragraphs are very good along with the graph in pointing out the “unexplained” discrepancies of a possible mid-19th century warming signal. sorry if this is off-topic and if you have covered it before. of course we are coming out of an ice age, but could waldo be hiding in the glaciers prior to AGW?

  8. John Hekman
    Posted Mar 10, 2008 at 2:52 PM | Permalink

    the map with the dots clearly indicates that unprecedented warming is……regional!!

  9. Patrick Hadley
    Posted Mar 10, 2008 at 2:54 PM | Permalink

    I would like to ask about Professor Ian Jolliffe, whose book seems to be the definitive guide to the correct use of PCAs. Tamino uses that book to justify the use of non-centred means. In a powerpoint presentation on PCAs “To centre or not to centre, or to perhaps do it twice” Jolliffe appears to say that it is OK to use non-centred means when the data is in the form of anomalies, but that otherwise he does not recommend it.

    I do not know whether proxy data from tree rings counts as anomaly data. Can anyone explain whether Jolliffe is behind MBH or not? He is certainly a very highly regarded statistician who is an expert on climatology. Does he give the use of non-centred means by MBH his seal of approval?

  10. JS
    Posted Mar 10, 2008 at 3:39 PM | Permalink

    A technique is neither right nor wrong on its own. It needs to be applied to the appropriate circumstances. Thus, one can’t make the blanket statement “it is OK to use non-centred means when the data is in the form of anomalies” any more than one can say that “concrete is the best building material”. And even then, one can use a ‘good’ technique badly. I have seen many regressions where someone puts a grab bag of variables on the right-hand side, drops all of them with a t-stat below 2 and ends up with a supremely misleading (and invariably lacking in any robustness) model. You can’t blindly follow a recipe with this stuff and expect to get a meaningful result.

  11. Jean S
    Posted Mar 10, 2008 at 4:29 PM | Permalink

    #9: Patrick, he does not give an approval to MBH. It is just that certain people try to confuse (by, e.g., using imprecise language) you to think that Jolliffe’s presentation has something to do with MBH-style PCA. Let me try this again:

    -Jolliffe’s non-centered PCA means that nothing is removed from the series.
    -MBH is not doing non-centered PCA in the sense of Jolliffe. They are removing an estimated mean (average) from the series. Not the whole sample average, but a sample average of a part of series (calibration period). Steve is using the term “short centering” above for this. I used the term partial centering earlier.

    Jolliffe states the case for using non-centered PCA (slides 21 & 24):

    It seems unwise to use uncentred analyses unless the origin is meaningful. Even then, it will be uninformative if all measurements are far from the origin

    One case where uncentred analyses are appropriate is if we can assume that the population means of our variables are zero, although the sample means are not

    To put that in the MBH short centering context is essentially to say that if you take long enough tree ring series (say tens of thousands of years) then the mean of that would be the same as the average you calculated over the calibration period (1902-1980), i.e. your “origin”! And this should be true for all of your tree ring series. A reasonable assumption?

    Additionally, I’d like to draw your attention to the fact that this Mannian PCA was not described in MBH. They stated that they used “conventional PCA”. So if they truly believed that, for some reasons I can’t imagine, this “PCA” was superior to normal PCA in the situation in hand, why didn’t they even mention it? Why didn’t they tell us that for these and these reasons we are not using conventional PCA, but (our own) modification of it? Or more importantly, why has nobody yet come up with a reference in the literature where this “short centering PCA” is analyzed and/or justified? Or even used besides MBH.
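The three treatments distinguished above (nothing removed, full-sample mean removed, calibration-period mean removed) can be put side by side. This is a toy sketch under stated assumptions: the eight-point series and the helper name are invented, and the last two points stand in for the 1902-1980 calibration period.

```python
def center(series, start=None, stop=None):
    """Subtract the mean of series[start:stop] from every value in series."""
    window = series[start:stop]
    mean = sum(window) / len(window)
    return [x - mean for x in series]


# Hypothetical series: flat for six steps, then an upturn in the last two.
series = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 3.0, 3.0]

uncentered = list(series)            # non-centered in Jolliffe's sense: nothing removed
full = center(series)                # conventional centering: full-sample mean removed
short = center(series, start=-2)     # short centering: mean of the last two points removed

print("uncentered:", uncentered)
print("full      :", full)
print("short     :", short)
```

Only the fully centered version sums to zero. The short-centered version is zero over the stand-in calibration window and offset everywhere else, which is the sense in which the calibration mean is being treated as the "origin".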

  12. EJ
    Posted Mar 10, 2008 at 4:39 PM | Permalink

    Wow, one need only look at that map and wonder. No data from Africa, one data point in Asia. And to weight tree rings heavily? Sheeesh.

  13. SteveSadlov
    Posted Mar 10, 2008 at 4:40 PM | Permalink

    As promised. Here is the thread, over at the CA forum, on the topic of possible future studies to correlate ring width and latewood density responses to temperature, and / versus precip (especially cold season snow pack provided moisture) for BCPs and other species found within 100 miles’ radius, at upper treeline, in areas with a reasonably similar synoptic scale and topographical climate:

    http://www.climateaudit.org/phpBB3/viewtopic.php?f=5&t=67

    Some of the other species of interest may be Red Fir, Jeffrey Pine, Foxtail Pine, and possibly even Ponderosa Pine.

  14. Kristen Byrnes
    Posted Mar 10, 2008 at 4:53 PM | Permalink

    Would centering the PCA be like balancing a hockey stick on the 5 foot snow banks in front of my house?

  15. Michael Jankowski
    Posted Mar 10, 2008 at 4:55 PM | Permalink

    Re#7, I’ve raised the issue of the “unexplained discrepancies” between glacier retreat and the surface record as stated in the 2001 IPCC TAR before. Gavin at RC responded that ‘he didn’t know why they said that’ and said any discrepancies were ‘overblown.’ I brought the paragraph or two up on Tamino a year or so ago to prove a point, and I was told I couldn’t discuss anything from 2001 since the 2007 IPCC report was out.

    One would think that a physical and relatively global suggestion that the surface record is substantially inaccurate at least as recently as 150 yrs ago would be of high importance.

  16. Pat Frank
    Posted Mar 10, 2008 at 5:06 PM | Permalink

    Here’s the major concern about PCA applied to temperature reconstruction that seems to be consistently ignored by virtually everyone.

    The interpretation of the PCs can be difficult at times. Although they are uncorrelated variables constructed as linear combinations of the original variables, and have some desirable properties, they do not necessarily correspond to meaningful physical quantities.

    This statement can be found in “A survey of dimension reduction techniques” by Imola K. Fodor, available on the Lawrence Livermore Nat’l Labs website, here.

    The long and short of it is that principal components are physically meaningless. In order to derive physical meaning from PC’s, they must be interpreted through a physical theory.

    Proxy temperature reconstructions are making a scientific assertion, and do not represent an inference restricted to the field of Statistics. Zeroing, normalizing standard deviations, rescaling to and ‘training against’ a time-series temperature measurement does not turn a PC into a physically meaningful temperature trend.

    This whole business of blind re-interpretation of tree-ring PC’s, or any other sort of core-derived PC’s, as physically meaningful temperature series absent a physical theory is a scientific grotesquerie. It is no more than false precision, empty of scientific meaning. It is wrong. It is scandalous.

  17. Jean S
    Posted Mar 10, 2008 at 5:28 PM | Permalink

    #16 (Pat): I don’t think that’s a problem of PCA in the intended meaning in MBH. They are not giving any meaning to PCs. PCs are treated as if they were other proxies. That is, they are not assumed to be meaningful physical quantities, but only linear in temperature. That assumption is IMHO questionable, but that’s another question. If the original tree rings are linear in temperature, so are PCs as they are simply linear combinations of the original series. The actual reconstruction of temperature series out of proxies is supposed to come later in the MBH algorithm (which, of course, does not work out that way).
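The linearity argument above can be written out in one line. The notation is assumed for illustration and does not come from MBH: x_i(t) is the i-th proxy series, T(t) is temperature, a_i and b_i are a slope and offset, ε_i(t) is noise, and w_{ki} are the weights defining the k-th PC.

```latex
x_i(t) = a_i\,T(t) + b_i + \varepsilon_i(t)
\quad\Longrightarrow\quad
\mathrm{PC}_k(t) = \sum_i w_{ki}\,x_i(t)
= \Big(\sum_i w_{ki} a_i\Big) T(t) + \sum_i w_{ki} b_i + \sum_i w_{ki}\,\varepsilon_i(t)
```

So a PC inherits linearity in T(t) only to the extent that each proxy is itself linear in T(t), and its effective slope is the weighted sum of the individual slopes.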

  18. Terry
    Posted Mar 10, 2008 at 5:38 PM | Permalink

    Steve says:
    Tamino makes a try, claiming that Mannian methodology is within an accepted literature, that it has desirable properties for climate reconstructions and that there were good reasons for its selection by MBH.

    For the most part I have to agree with Tamino here.

    … that it has desirable properties for climate reconstructions… This is true since it produces the hockey stick and this was the desirable property that they wanted.

    … that there were good reasons for its selection by MBH…. Also correct since the reason it was selected was because it produces the hockey stick.

  19. Stan Palmer
    Posted Mar 10, 2008 at 6:00 PM | Permalink

    re 19

    I hope that this is not a distraction, but how does Mann select one PC out of several and say it relates to temperature?

  20. Pat Frank
    Posted Mar 10, 2008 at 6:07 PM | Permalink

    Jean, #17, as soon as PC1 is scaled to temperature, in MBH and everywhere else, it’s immediately interpreted as representing a physical temperature anomaly. That’s imposing physical meaning by mere assignment. MBH’s argument is in the province of science, not in a province of statistics. As such they are making a claim — superposition of PC1 with a measured temperature converts PC1 into a temperature anomaly — that is entirely unjustified within science. That, no matter that the statistics is rigorous (or not, as M&M have demonstrated).

    The first scandal is that a trained physicist, Michael Mann, made such a claim. The second scandal is that it passed peer review and was published. The third scandal is the unholy rush of other trained proxy climatology scientists to embrace a false method, almost certainly because it gave them pseudo-results that revolutionized their field from climatology into thermometry, that in turn granted them fame and tenure. The fourth scandal is that few or no other highly trained physical scientists have called them on it in peer-reviewed print.

    I’ve tested my opinion by asking various physicists and statistically savvy mathematicians about the physical meaning of principal components. The uniform answer is that they have none; no physical meaning. Nevertheless, there is no outcry in that field (climatology), or any other, that quantitative physical meaning is being assigned, by mere qualitative inspection, to PC’s from climatological core series.

    It’s a terrible scandal; part of the enormous scandalous matrix that is the claim of AGW.

  21. old construction worker
    Posted Mar 10, 2008 at 6:32 PM | Permalink

    Kristen says ref 14
    “Would centering the PCA be like balancing a hockey stick on the 5 foot snow banks in front of my house?”

    LOL May I add in a “howling” wind storm

  22. Kenneth Fritsch
    Posted Mar 10, 2008 at 7:47 PM | Permalink

    Re: #4

    Most of what you posted is a blur to me, Steve, lol. Is a layman’s summary possible?

    As a layperson in these matters I attempt to understand the importance of what is being said by looking for things like sensitivity analyses (which in this case goes to the heart of the geographical concentration of data and the performance of the reconstructions depending greatly on the [questionable] use of bristle cone pines); what one finds by reading the fine print of what the authors have referenced and attributed to a reference that might not hold up to closer scrutiny and whether the methodologies are newly minted or well accepted practices.

    Scientists and statisticians do not use the term but one needs to look for any BS — as creatively and imaginatively as it might appear.

  23. Patrick Henry
    Posted Mar 10, 2008 at 8:13 PM | Permalink

    I attempted to post this on Pit Bull’s blog, and it was censored. I wonder why?

    Dr. Hansen recently gave a presentation at Illinois Wesleyan University, where he attempted to demonstrate that man-made global warming was causing a meltdown in Greenland.

    He failed to mention, though, that every single long-term GISS record from Greenland shows that the 1930s and 1940s were at least as warm as the present.

    http://data.giss.nasa.gov/cgi-bin/gistemp/gistemp_station.py?id=431042500000&data_set=1&num_neighbors=1
    http://data.giss.nasa.gov/cgi-bin/gistemp/gistemp_station.py?id=620040630003&data_set=1&num_neighbors=1
    http://data.giss.nasa.gov/cgi-bin/gistemp/gistemp_station.py?id=634011520003&data_set=1&num_neighbors=1
    http://data.giss.nasa.gov/cgi-bin/gistemp/gistemp_station.py?id=431043600000&data_set=1&num_neighbors=1

  24. Ross McKitrick
    Posted Mar 10, 2008 at 8:42 PM | Permalink

    #4: Jeff, maybe http://www.climateaudit.org/index.php?p=166 would provide the needed background.

  25. Ross McKitrick
    Posted Mar 10, 2008 at 9:18 PM | Permalink

    It’s very telling that Tamino fails to link to a single one of our papers, and his cheerleaders and groupies sure haven’t read them. I wonder if Tamino himself has read our E&E05 paper, where we discuss, among other things, the various ways to go between a hockey stick and non-hockey stick reconstruction. One of the ways we discuss is to use centered PCA and vary the list from 2 to 5 PCs in the NOAMER network (see pp 75-76). His post presents it like he discovered it, and we didn’t mention it. We discussed the fact that the hockey stick shape drops to the 4th PC (with accompanying collapse in the associated eigenvalue) in our GRL article as well. He also ignores the problem of insignificance even with 5 PCs, the zero r-squared scores first shown by M&M and then confirmed by Wahl&Ammann, the lack of robustness to excluding the bristlecones, etc.

    Tamino claims great insight into why Mann used de-centered PCA in a form that preferentially weights the small number of hockey stick-shaped proxies in the PC1. Yet MBH never stated they were doing non-standard PCA, much less explained to their readers why they were doing it, so his revisionist history is pure speculation.

    I guess Tamino also hasn’t inspected the CENSORED folder or grasped the meaning of what was in it. Without the BCP’s the decentering doesn’t matter since the remaining series all have stable means, and no hockey stick emerges in any PC. Decentering mattered once a couple of hockey stick-shaped series were inserted into the network; then the Mann-method mined for the shapes and loaded them into the PC1.

    The issue with the hockey stick has always been robustness. Slight changes to the proxy roster overturn the results, slight changes to the method overturn the interpretations, and the test scores do not support the claim of significant extrapolative ability. These things were known in 1998, and if MBH had reported them, their result would have been seen differently from the outset–as a stab at combining some linear estimators, but unable to support any important conclusions.

  26. Pat Frank
    Posted Mar 10, 2008 at 9:40 PM | Permalink

    Ross #26 wrote, “so [Tamino’s] revisionist history is pure speculation.” (underline added)

    You’re being far too kind, Ross.

  27. Posted Mar 11, 2008 at 1:04 AM | Permalink

    #18

    … that it has desirable properties for climate reconstructions… This is true since it produces the hockey stick and this was the desirable property that they wanted.

    Indeed, MBH99 result corroborates alarmist- AGW theory extremely well. Two natural time-scales (millennial and annual), due to astronomical forcing and weather, and then clearly dominating A-CO2 effect superimposed on those. Designed to kill LTP / 1/f / unit-root models for climate series (Woodward, Gray et al published something related in ’90s., iirc)

  28. fred
    Posted Mar 11, 2008 at 1:44 AM | Permalink

    I’ve posted on tamino recently. There is a fair amount of boorishness lately – some of it may be because of tamino’s defining himself as Hansen’s Bulldog, and the occasionally intemperate tone he’s adopted. Some because of a mistakenly relaxed moderation policy which encourages the trolls. But it shows signs of improving. The basic explanation of PCA was good, and it was a pity that the MBH material in the last one has been somewhat unfocussed. The result is that it has taken the thread something like 250 sometimes very acerbic comments to get to the bottom of MBH style PCA.

    Me included, by the way. I wouldn’t have got to it properly without having gone through Tamino’s defence and feeling that this surely could not be right. Another good thing that has come out of it is that some people are now using R and posting results. The end result may be to improve the quality of discussion on his blog greatly, even if he should eventually agree he was mistaken about MBH’s PCA.

    It would be a good thing if so. Tamino has a real talent for education. He can take something complicated and confusing and explain the main lines of it very clearly, as one has seen both in the PCA postings and some others. One hopes he’ll moderate and make commenters do so, as he says he will.

    The striking thing about the blog in this episode has been social. There has been a claque with a very strong party line. It is not all commenters, but it’s a substantial number. Among them, the validity of MBH and the HS are articles of faith. To doubt them is to be a denier. It is logically perfectly possible to think MBH wrong but AGW a true and important thesis. The claque even admits this, and keeps saying that the validity of MBH does not matter, so we should all move on. At the same time, they react with complete fury to any questioning of it. The only place I’ve ever come across this insistence on defending to the death issues which are simultaneously contended not to be central to the main argument is in rather extreme and fundamentalist political and religious sects. One could perfectly consistently have believed that the Katyn massacre was done by out of control Russians, and that the Russian regime was an example for humanity, just not perfect in this instance. But no-one in the Party was able to do that.

    In fact, simply to enquire about the exact nature of the IPCC account, without any suggestion of scepticism, will result in being called a denier by some posters. I have been accused of being a denier for pointing out that if the climate sensitivity to CO2 doubling is 4 degrees, 1.2 is due to the direct effects of CO2 and the laws of physics, but 2.8 is due to complex positive feedback mechanisms.

    It is striking that some of the more egregious examples of denialism occur in the ranks of the faithful, who spend so much of their energy accusing enquirers of it!

    However, it’s not all negative. One of the posters is recently reading the M&M material and reproducing the arguments using R. Whether he turns out to agree or not, this will improve the blog.

    Personally by the way, I don’t know about warming. I know I do not believe in decentered or whatever it is PCA. I’m not convinced by high numbers on CO2 sensitivity either, they really don’t seem evidence based. But I can’t see it is sensible to cheerfully make large changes to the atmosphere and just not worry about it.

  29. John A
    Posted Mar 11, 2008 at 2:38 AM | Permalink

    I think it’s telling that despite such clear multiple independent demolitions of the Hockey Stick, climate alarmists cannot let go of the Hockey Stick – because it clearly shows in graphical form that which they believe to be true about the world.

    The one characteristic which encapsulates that belief is the fact that the Hockey Stick appears to follow the carbon dioxide Siple Curve. Thus it demonstrates a belief in greenhouse warming, whatever the ice core records may say.

    It’s not that Steve and Ross’ criticism has been shown to be even slightly mistaken – it’s more to do with the will to believe something that the scientific method says should be rejected.

  30. mccall
    Posted Mar 11, 2008 at 2:40 AM | Permalink

    In Tamino’s and I would say Dr Quiggin’s ham-handed interpretations of your work (RC is deliberately omitted as a COI in the social network), what was the rough timeline of event discovery for the sum points by which M&M has been you’ve critical the HS?
    1) Decentered PC’s
    2) Noise mishandling
    3) Proxy selection and bias
    4) Misuse of Preisendorfer
    5) CE vs. other correlation coef’s
    6) The hidden BCP proxy directory
    7) General BCP misuse as temp proxies
    8) Others

    The chronology of 3-7 would be of particular interest?

  31. mccall
    Posted Mar 11, 2008 at 2:42 AM | Permalink

    Correction: … timeline of event discovery for the sum points by which M&M has been critical of the HS?

  32. David Holland
    Posted Mar 11, 2008 at 3:47 AM | Permalink

    Re #9
    I asked Dr Jolliffe if he endorsed Mann’s non-centred PCA and this was his email reply on 25 Feb 2005

    I’m afraid that I can’t offer you much enlightenment. I did not hear
    Michael Mann on the Today programme. Nor do I know what ‘seminal
    work’ you refer to, or how or why he references me. From your email
    it may be a talk I gave in Cape Town last year, which was a brief review
    of alternative centerings – I can’t see that it said enough to used as a
    recommendation.

    My one (anonymous) interaction with Mann, his co-workers and his critics
    was last year when I acted as a referee for an exchange of views
    submitted to Nature. After a couple of iterations I came to conclusion
    that I simply could not understand what was being done sufficiently
    well to judge whether Mann’s methodology was sound, but I certainly
    would not endorse it. At least one other referee came to same conclusion.
    Although the exchange was not published in Nature I believe it may have
    appeared on a web site. I don’t know whether the methodology noted in
    your email is the same as that which referees found too opaque and/or
    complicated to understand.

  33. MarkW
    Posted Mar 11, 2008 at 4:02 AM | Permalink

    I find it very interesting that the proxies that show warming are assigned much more weight than the proxies that show cooling.
    Does anyone know the justification for this weighting?

  34. Jean S
    Posted Mar 11, 2008 at 4:07 AM | Permalink

    Steve, slightly off topic, but now that you have the link to the source code (my Fortran reading is lousy). This may be old news to you, but I don’t recall it mentioned anywhere:

    Me and UC were fine tuning our MBH emulators a while back. One of the annoying discrepancies was that we were getting lower verification REs than reported although our emulations otherwise seem very good. Even WA seem to have run into the same problem; they even have one hand-waving appendix for the issue (Appendix 4). Well, I think we found the reason (can anyone confirm this from the source code?): it seems that Mann is calculating verification REs with respect to “sparse” reconstructions! That is, verification REs are not calculated from the actual (stepwise) reconstructions, but from NH stepwise reconstructions obtained by limiting reconstruction grid cells to those used for calculating the sparse instrumental temperature.

    Steve: Yes, this is what he says he does. I have code to extract the “sparse” dataset.
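For readers unfamiliar with the statistic, here is a minimal sketch of the verification RE (reduction of error) as it is usually defined: one minus the ratio of the squared reconstruction error to the squared error of simply predicting the calibration-period mean. The function name and all numbers are invented for illustration, and nothing here is taken from the MBH or WA code.

```python
def reduction_of_error(observed, reconstructed, calibration_mean):
    """RE = 1 - sum((obs - rec)^2) / sum((obs - calibration_mean)^2)."""
    sse = sum((o - r) ** 2 for o, r in zip(observed, reconstructed))
    ss_ref = sum((o - calibration_mean) ** 2 for o in observed)
    return 1.0 - sse / ss_ref


# Hypothetical verification-period values, invented for illustration.
observed = [0.0, 0.2, -0.2, 0.4]
recon_a = [0.1, 0.1, -0.1, 0.3]      # one candidate reconstruction
recon_b = [0.3, 0.5, 0.1, 0.7]       # another candidate, scored against the same target
calibration_mean = 0.5

re_a = reduction_of_error(observed, recon_a, calibration_mean)
re_b = reduction_of_error(observed, recon_b, calibration_mean)
print(f"RE (a) = {re_a:.3f}, RE (b) = {re_b:.3f}")
```

Since RE depends on exactly which reconstructed series is compared with which instrumental target, scoring a "sparse" reconstruction instead of the full stepwise one would be expected to change the reported value.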

  35. David Holland
    Posted Mar 11, 2008 at 4:24 AM | Permalink

    Re# 30, John,

    You are right to point to the “marketing appeal” of the hockey stick but it is also the linchpin of IPCC, 2001, which Tamino thinks we should now forget. See IPCC, 2001 WGI Chapter 12 (Attribution) page 702:

    We expect, however, that the reconstructions will continue to improve and that palaeo-data will become increasingly important for assessing natural variability of the climate system. One of the most important applications of this palaeoclimate data is as a check on the estimates of internal variability from coupled climate models, to ensure that the latter are not underestimating the level of internal variability on 50 to 100 year time-scales.

    We shouldn’t get into conspiracy theories or too much social network analysis but note who were the lead authors of that chapter and of the Palaeo Chapter it was referring to and then remember who were the Review Editors of IPCC, 2007 WGI Chapter 6.

  36. Geoff Sherrington
    Posted Mar 11, 2008 at 4:50 AM | Permalink

    Steve, beautiful work. Please keep hammering this.

    Suppose that tree rings in carefully controlled conditions did serve as a good guide to air temperature. Then go out to Nature like you did, to find that sometimes the air was cold from the North and then warm from the South. Lacking fine detail of winds, one is not allowed to assume that a PCA over decades will unravel useful predictive ability linking temp to tree rings. The calibration period will lack essential data and the PCA, done conventionally, will not indicate a comprehensible outcome.

    Dreamin’.

  37. Andrew
    Posted Mar 11, 2008 at 6:03 AM | Permalink

    The irony is that such a flat climate in the past is kind of an argument for a low climate sensitivity…
    Tamino and RC must know this. They seem to cling to the HS because it is effective propaganda.

  38. Patrick Hadley
    Posted Mar 11, 2008 at 6:58 AM | Permalink

    David Holland #37, 4.34

    Thank you very much for that quote from Professor Ian Jolliffe.

    We can certainly see what he thinks about MBH from the quote: “I came to conclusion that I simply could not understand what was being done sufficiently well to judge whether Mann’s methodology was sound, but I certainly would not endorse it.” I cannot believe that Bulldog would have used Jolliffe’s book as an authority if he had been aware of that comment.

    It is a little surprising that someone like him who is clearly an expert statistician specialising in Climatology who has written very important works on PCAs and (weather) Forecast Verification does not have more curiosity about the methods used in MBH98, especially since his text book is used on Real Climate and Bulldog to justify them.

  39. Posted Mar 11, 2008 at 7:05 AM | Permalink

    A bit OT but,

    In #28 I referred to [1], where they write:

    The authors consider the problem of determining whether the upward trending behavior in the global temperature anomaly series should be forecast to continue.

    And they say that it might not be reasonable to forecast the future temperatures as increasing (based on the data only). This was published in 1995, and with a few more cold months (say zero anomaly for HadCRU for the rest of the year), the 1996 – 2008 monthly anomaly will trend down. (That wouldn’t change anything, just a note.)

    [1] Selecting a Model for Detecting the Presence of a Trend, Wayne A. Woodward and H.L. Gray, Journal of Climate, Aug 1995

  40. Michael Jankowski
    Posted Mar 11, 2008 at 7:31 AM | Permalink

    I cannot believe that Bulldog would have used Jolliffe’s book as an authority if he had been aware of that comment.

    I’m sure he wasn’t aware of that comment, but would that really stop him? All he has to do is convince his readers that Jolliffe supports MBH98 methodology. Looking at many of the responses on that thread, he has. They aren’t going to come over here and see Jolliffe’s comment. Tamino’s not going to let it get posted on his site.

  41. Ross McKitrick
    Posted Mar 11, 2008 at 8:23 AM | Permalink

    When thinking about Tamino’s attempt to present PCA on decentered data as some kind of legitimate methodological innovation, two points stand out:

    – The source he cites is a powerpoint slide show on a tangential topic, from someone who has said he does not endorse the particular method in question;
    – de-centering on a subsample merely inflates the variance of vectors whose means shift over that subsample. Since the choice of subsample interval is arbitrary, the relative inflation of variances is arbitrary. That’s why PCA on centered data is standard, and departure from the standard requires explicit disclosure and justification. If the results depend on the particular form of decentering you have to explain this and make a case why the arbitrary transformation of the data is necessary. You could make a case for picking the earliest period for centering, and then the series that get boosted to the PC1 would be those with a trend in the 1400s. That wouldn’t “prove” that those series are the most representative of the global climate though.
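The second point can be illustrated with a toy calculation. This is a sketch under stated assumptions, not a reproduction of any MBH step: the two ten-point series are invented, the last two points stand in for the centering subsample, and the sum of squares about the chosen mean is used as the measure of "variance" that a PCA would see.

```python
def sum_of_squares(series, window):
    """Sum of squared deviations of the whole series from the mean of series[window]."""
    sub = series[window]
    mean = sum(sub) / len(sub)
    return sum((x - mean) ** 2 for x in series)


# Hypothetical series, invented for illustration.
stable = [1.0, -1.0] * 5            # mean is zero over every even-length window
shifted = [0.0] * 8 + [2.0] * 2     # mean shifts upward in the last two points

full = slice(None)                  # center on the full sample
late = slice(-2, None)              # center on the last two points only

for name, s in (("stable", stable), ("shifted", shifted)):
    ratio = sum_of_squares(s, late) / sum_of_squares(s, full)
    print(f"{name}: short-centered / fully centered sum of squares = {ratio:.1f}")
```

In this toy case the stable series is unaffected by the choice of centering window, while the series whose mean shifts in the subsample has its sum of squares inflated fivefold, so it would receive correspondingly more weight in a variance-maximizing decomposition.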

  42. RomanM
    Posted Mar 11, 2008 at 8:58 AM | Permalink

    You make a very good point which should not be overlooked. In his book, Jolliffe allows (grudgingly, IMHO) that a case might be made to centre at “the origin” if the origin is a meaningful value:

    (i) the columns of X are left uncentred, that is xij is now the value for the ith observation of the jth variable, as originally measured;

    As noted by Ter Braak (1983), the technique projects observations onto the best fitting plane (or flat) through the origin, rather than through the centroid of the data set. If the data are such that the origin is an important point of reference, then this type of analysis can be relevant.

    (Principal Component Analysis, p. 389)

    The italics were not added by me but appear in the text. The entire section appears to be more of a matter-of-fact exposition of what is done by climate scientists than an advocacy of the correctness of the technique. It seems pretty clear from the italics that the centering should be with respect to a fixed value and not an arbitrary value estimated from the sample. As you correctly indicate, the latter would greatly increase the uncertainty of any results.

    As a matter of note, his treatment of the procedure in the powerpoint (the lack of interpretability of the second “component” in his example and the warnings regarding its use in the final slide) seems to clearly indicate his understanding that this is a questionable process.

  43. Kenneth Fritsch
    Posted Mar 11, 2008 at 9:07 AM | Permalink

    Re: #33

    After a couple of iterations I came to conclusion that I simply could not understand what was being done sufficiently well to judge whether Mann’s methodology was sound, but I certainly would not endorse it. At least one other referee came to same conclusion. Although the exchange was not published in Nature I believe it may have appeared on a web site. I don’t know whether the methodology noted in your email is the same as that which referees found too opaque and/or complicated to understand.

    I think I should have added to my list of layperson’s methods of evaluation of statistically and scientifically technical papers the opaqueness factor as noted above, and the suspicion and/or blatant use of cherry picking data inputs, or outputs, for that matter. Cherry picking can sometimes be revealed in the sensitivity analysis I noted earlier, but sometimes it is obvious without analysis. It all goes along with looking for the B word tendencies.

    I have aired this view previously but, in my mind Mann’s magnum opus gave the climate scientists a measure of past climate that fit their already formed consensus view so well that it was difficult for any but a few climate scientists to deconstruct what was done. It would appear that climate science has moved on to preferentially using climate models over climate reconstructions, but there are many climate science supporters (and a few scientists who are hardcore Mann supporters) of the consensus view that evidently do not have sufficient conviction of their views to allow for the Mannian mistakes.

  44. Mark T.
    Posted Mar 11, 2008 at 9:11 AM | Permalink

    – de-centering on a subsample merely inflates the variance of vectors whose means shift over that subsample.

    Therein lies the problem. Incorrect centering inflates the variances with a DC bias, placing emphasis on proxies that have time varying means. Those that have stable means, near the chosen center point, will be unaffected. A similar problem arises with Mann’s ergodicity assumption in his RegEM method. Not that he explicitly assumed ergodicity, but his method for taking the mean of the means and applying it to the entire block does just that. Jean S first noticed this.

    Mark

  45. steven mosher
    Posted Mar 11, 2008 at 9:18 AM | Permalink

    re 32. see if you can get that posted in tammy town

  46. Posted Mar 11, 2008 at 10:35 AM | Permalink

    Mark,

    Those that have stable means, near the chosen center point, will be unaffected. A similar problem arises with Mann’s ergodicity assumption in his RegEM method.

    Ah, another accident. BTW, any guesses why Rutherford et al. 05 shows only 1400-present reconstructions ?

  47. Mark T.
    Posted Mar 11, 2008 at 10:43 AM | Permalink

    I wouldn’t doubt if it were actually an accident, though a good scientist would strive to learn the problems with such assumptions (explicit or implicit), and correct any errors that may have resulted in previous analyses.

    Perhaps the DC bias prior to 1400, such as during the MWP, which coincidentally shows up in most of the proxies (as I recall), would cause problems for the “warmest in a milluuuuun years” claim? :)

    Mark

  48. Bernie
    Posted Mar 11, 2008 at 11:07 AM | Permalink

    In the work that I do with survey data, I am always nervous, when using PCA to identify a more efficient way to summarize the data, that what I am finding are groups of respondents, not groups of related explanatory variables. The only way to really tell, as Steve points out above, is to carefully check that the variables loading on a particular factor make real sense and do not reflect some as yet unmeasured variable that links a subset of respondents. This inspection has to be done factor by factor.
    My read of what Steve and Ross (and others here) have done is to point out that PC1 or the HS factor is a characteristic of a group of respondents (the BCPs) and not the underlying temperature signal.
    Is this a correct interpretation?

  49. Jeff A
    Posted Mar 11, 2008 at 11:15 AM | Permalink

    Imagine if you calculated the batting average for a baseball team of three players like this:

    Thanks Lucia! I was kind of thinking that’s where it was headed, but wasn’t sure.

  50. Jeff A
    Posted Mar 11, 2008 at 11:26 AM | Permalink

    Jeff, maybe http://www.climateaudit.org/index.php?p=166 would provide the needed background.

    Thanks Ross!

  51. Mark T.
    Posted Mar 11, 2008 at 11:48 AM | Permalink

    My read of what Steve and Ross (and others here) have done is point out that PC1 or the HS factor is a characteristic of group of respondents (the BCPs) and not the underlying temperature signal.
    Is this a correct interpretation?

    If you amend that to may not be the underlying temperature signal then yes, your interpretation is correct. This is confounded by the fact that the HS signal is predominantly present in BCPs, which end up getting weighted heavier than other proxies. Whether or not it is truly temperature is hard to say without removing other factors that may be contributing to their shape, too. PCA does not assign a “flag” to the results indicating which is which, and when multiple inputs to the system (e.g. solar, precipitation, CO2 fertilization) are correlated, disaggregation is even more difficult (if not outright impossible).

    The short answer is that the “signal” that shows up is assumed to be temperature simply because the surface temperature readings are increasing at the same time, and the BCPs are assumed to be responding to temperature, not other factors. That BCPs will necessarily respond to local temperatures, not global temperatures, is lost on the proponents of the proxy reconstruction theory (via PCA), btw.

    Mark

  52. Jeremy
    Posted Mar 11, 2008 at 12:52 PM | Permalink

    If I were a mathematician, I would say the lack of a gridded data set representing a physical system governed by equations that can be analytically solved invalidates the entire use of PCA methodology in the analysis of any climate data. Mathematicians are sticklers for proper use of methods, and rightly so. They are the “grammar” [enforcers] of the scientific world.

    The reasoning is simple. You have no gridded surface data for temperature or rainfall or any weather phenomena, you have scattered stations around the globe and nearly nothing on the ocean. With satellite data you at least have a grid, but the physicist in me says good luck finding an equation that can represent the physical reality you’re collecting data from. In too-simple terms, no boundary conditions = no understanding of what’s going on.

  53. Sam Urbinto
    Posted Mar 11, 2008 at 1:04 PM | Permalink

    What Jeremy said.

  54. Geoff Sherrington
    Posted Mar 11, 2008 at 6:48 PM | Permalink

    In a way, this should never have got to the point of statistics, except to probe for their applicability.

    In # 36 above I mentioned that temperature of trees and wind direction confounded. The tree is a living system with cycles of growth and rest that are impacted by temperature. There are times when a short blast of hot or cold out of season can limit or help growth for a year or more.

    Analogy: People need to drink water to avoid getting ill/dying. Over a month, one might drink a similar volume each day; or to argue by extreme, could drink at twice this rate in the first fortnight and drink nothing in the last fortnight. The monthly average remains the same; the consequence is reduced growth, maybe permanently.

    Same with trees. Growth is affected not just by the average value of monthly temperature, but also by the distribution pattern of temperatures within each month. If the calibration period cannot infill the fine structure, the stats will not be capable of reconstructing the past.
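
    A toy Python sketch of the point above, assuming a purely hypothetical inverted-parabola growth response (the 15 °C optimum, the 10 °C width, and both sets of daily temperatures are invented for illustration, not taken from any tree ring study). Two months share the same mean temperature but produce different total growth, so the monthly mean alone cannot recover the growth.

```python
def mean(values):
    return sum(values) / len(values)

def daily_growth(temp_c, optimum_c=15.0, width_c=10.0):
    # Hypothetical response: growth peaks at optimum_c and falls off
    # quadratically, reaching zero width_c degrees away from the optimum.
    return max(0.0, 1.0 - ((temp_c - optimum_c) / width_c) ** 2)

def total_growth(daily_temps):
    return sum(daily_growth(t) for t in daily_temps)

# Same monthly mean (15.0), different within-month distribution.
steady_month = [15.0] * 30
swinging_month = [9.0] * 15 + [21.0] * 15

print(mean(steady_month), total_growth(steady_month))
print(mean(swinging_month), total_growth(swinging_month))
```

    Under these assumptions the month that swings between cool and warm spells grows less than the steady month, even though a calibration on monthly means would treat the two as identical.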

    Having taken very rare trees from China to Australia, I know a little about coping with seasonal changes (NH to SH in a day), so my comment contains practice as well as theory. Is that so novel?

  55. mccall
    Posted Mar 12, 2008 at 12:51 AM | Permalink

    re: 30
    5) RE vs. other correlation coef’s such as R2 (CE wasn’t determined and therefore not an issue)

  56. Ade
    Posted Mar 13, 2008 at 4:01 AM | Permalink

    I’m confused (not difficult!), and not being a statistician is not helping…

    There is much talk of the bristlecones being overweighted, that I understand & hence the reason the hockey stick appears – what I don’t quite get is how the weightings get to be where they are. Is it a deliberate choice by Mann et al, or is it some artifact of PCA analysis which up-weights certain datasets “automagically”?

    (Apologies if this is a dumb question, or is answered elsewhere)

  57. Steve McIntyre
    Posted Mar 13, 2008 at 7:03 AM | Permalink

    #56. The problems are multi-layered. No one knows for sure what Mann knew about his method. There are issues relating to the application of correct PC methods to tree ring data sets. These problems are made worse by the erroneous Mannian method.

    With 20-20 hindsight, it would have been possible for Mann to have used a PC method which was less bad – in which case the issue would be squarely on the validity of applying conventional principal components to the North American tree ring data set as a means of obtaining a temperature proxy – which has never been established – and of the validity of bristlecones.

    Think of a murder victim with multiple stab wounds. The police arrest the husband and can prove that he stabbed his wife repeatedly and charge him with murdering his wife by stabbing her to death. Let’s now suppose that his defence is that he had already smothered his wife to death and the stabbing took place into her dead body so the claim that he murdered her by stabbing her to death was wrong. As I understand it, the count could be amended and the accused would not go free on such a technicality.

    Mann applied PC methodology to a tree ring network with multiple bristlecones. Trying to find the “real” problem in such a mess is like a complicated episode of CSI. The bristlecones are one problem; applying ordinary PCA to tree ring networks is another problem; Mannian PCA is another problem. Removing bristlecones from this network doesn’t necessarily improve things at all. I’m not saying that there’s a “right” way to get an answer out of this mess.

  58. Vincent
    Posted Mar 13, 2008 at 7:55 AM | Permalink

    Preisendorfer 1988

    The premise of a physical system …… of linear ordinary or of linear partial differential equations

    Tree ring response to temperature can hardly be described as “linear”. I would think “parabolic” is more apt…. The pretext for using Preisendorfer seems to collapse from the outset.

  59. Ade
    Posted Mar 13, 2008 at 4:21 PM | Permalink

    #57: Thanks Steve, I guess until Mann actually comes clean about exactly what data & methods he used, we shall never know….

  60. Posted Mar 14, 2008 at 9:44 AM | Permalink

    Mark T

    Perhaps the DC bias prior to 1400, such as during the MWP, which coincidentally shows up in most of the proxies (as I recall), would cause problems for the “warmest in a milluuuuun years” claim?

    Or lack of long-term cooling trend might be the reason, who knows ;)
