<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: More on Moberg</title>
	<atom:link href="http://climateaudit.org/2005/09/06/345/feed/" rel="self" type="application/rss+xml" />
	<link>http://climateaudit.org/2005/09/06/345/</link>
	<description>by Steve McIntyre</description>
	<lastBuildDate>Tue, 21 May 2013 15:32:22 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: TCO</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36782</link>
		<dc:creator><![CDATA[TCO]]></dc:creator>
		<pubDate>Sun, 23 Jul 2006 12:47:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36782</guid>
		<description><![CDATA[Ok, here is a more specific request.  (Still want the superman of PCA, but a more targeted request).  Ross has mentioned that there are a multitude of &quot;transforms&quot; that can be applied to data before putting them into PCA.  On this site we have discussed both subtraction of the mean (centered or off-centered) and division by the standard deviation.  Ross says there are more (and I&#039;ve even read on a blog a post by someone who advocates dividing by standard deviation twice or maybe it was twice the standard deviation...whatever).  Anyhow, I want the guru of data transforms.  Someone who is familiar with the general set of transforms and who can advise on the plusses and minuses of various transforms for various situations.]]></description>
		<content:encoded><![CDATA[<p>Ok, here is a more specific request.  (Still want the superman of PCA, but a more targeted request).  Ross has mentioned that there are a multitude of &#8220;transforms&#8221; that can be applied to data before putting them into PCA.  On this site we have discussed both subtraction of the mean (centered or off-centered) and division by the standard deviation.  Ross says there are more (and I&#8217;ve even read on a blog a post by someone who advocates dividing by standard deviation twice or maybe it was twice the standard deviation&#8230;whatever).  Anyhow, I want the guru of data transforms.  Someone who is familiar with the general set of transforms and who can advise on the plusses and minuses of various transforms for various situations.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: TCO</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36781</link>
		<dc:creator><![CDATA[TCO]]></dc:creator>
		<pubDate>Sun, 23 Jul 2006 12:37:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36781</guid>
		<description><![CDATA[Bender:  It&#039;s not meant as a vague answer.  I know that you are frustrated because you don&#039;t know the answer or even if a person exists, but I could say the same analogous thing about crystallography and come up with a few names for you.  Thanks for the R-code name.

What I want is someone who thinks about the subtleties in the method (even though it&#039;s an &quot;old development&quot;) rather than just using it like a technician.  There are people at the very, very highest levels of science who push the &quot;I believe&quot; button.  Sometimes that is fine and efficient.  Other times, you want someone who always wonders what&#039;s under the hood.  In the case of PCA, thinking through the methods and assumptions would need someone with knowledge of both theoretical and application pitfalls.]]></description>
		<content:encoded><![CDATA[<p>Bender:  It&#8217;s not meant as a vague answer.  I know that you are frustrated because you don&#8217;t know the answer or even if a person exists, but I could say the same analogous thing about crystallography and come up with a few names for you.  Thanks for the R-code name.</p>
<p>What I want is someone who thinks about the subtleties in the method (even though it&#8217;s an &#8220;old development&#8221;) rather than just using it like a technician.  There are people at the very, very highest levels of science who push the &#8220;I believe&#8221; button.  Sometimes that is fine and efficient.  Other times, you want someone who always wonders what&#8217;s under the hood.  In the case of PCA, thinking through the methods and assumptions would need someone with knowledge of both theoretical and application pitfalls.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MarkR</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36780</link>
		<dc:creator><![CDATA[MarkR]]></dc:creator>
		<pubDate>Sun, 23 Jul 2006 08:56:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36780</guid>
		<description><![CDATA[Just to summarise. These papers, Moberg and Mann et al, seem to use the same basic strategy.

1 Find a number of different proxy (or alleged proxy) datasets.
2 Make sure you include one (Mann et al Bristlecone), or two (Moberg #1 and #11) that have a long flat line, followed by a steep rise in the 20th century.
3 Use whatever statistical technique is available to overweight the specially selected (for HockeyStickness) proxies, and drown out as far as possible the information from the others.
4 Don&#039;t use the standard statistical tests for robustness, either ignore the inconvenient ones, or invent some new one.
5 Voila, you have a peer reviewed, multi proxy, statistically robust Hockey Stick.

PS If I had my way, every one of these Hockey Stick papers should have a qualified statistician on board, and also be reviewed by at least one.

The reason the climatologists don&#039;t include the statisticians is because this is almost entirely statistical work, and they would have to credit the statistician accordingly.

The climatologists doing this work are merely recycling other people&#039;s data, using statistical methods that, as SteveM has shown time and again, the climatologists have no real understanding of.]]></description>
		<content:encoded><![CDATA[<p>Just to summarise. These papers, Moberg and Mann et al, seem to use the same basic strategy.</p>
<p>1 Find a number of different proxy (or alleged proxy) datasets.<br />
2 Make sure you include one (Mann et al Bristlecone), or two (Moberg #1 and #11) that have a long flat line, followed by a steep rise in the 20th century.<br />
3 Use whatever statistical technique is available to overweight the specially selected (for HockeyStickness) proxies, and drown out as far as possible the information from the others.<br />
4 Don&#8217;t use the standard statistical tests for robustness, either ignore the inconvenient ones, or invent some new one.<br />
5 Voila, you have a peer reviewed, multi proxy, statistically robust Hockey Stick.</p>
<p>PS If I had my way, every one of these Hockey Stick papers should have a qualified statistician on board, and also be reviewed by at least one.</p>
<p>The reason the climatologists don&#8217;t include the statisticians is because this is almost entirely statistical work, and they would have to credit the statistician accordingly.</p>
<p>The climatologists doing this work are merely recycling other people&#8217;s data, using statistical methods that, as SteveM has shown time and again, the climatologists have no real understanding of.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MarkR</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36779</link>
		<dc:creator><![CDATA[MarkR]]></dc:creator>
		<pubDate>Sun, 23 Jul 2006 08:25:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36779</guid>
		<description><![CDATA[Re#17 I&#039;m afraid I&#039;m not a statistician, so it&#039;s difficult to put it in correct terminology, but I have read the Wegman Report, and will use parts of it to illustrate what I mean.

1  Why use PCA at all? My understanding is it is a method for consolidating a large number of data series, but Moberg only has 11.

from &lt;a href=&quot;http://energycommerce.house.gov/108/home/07142006_Wegman_Report.pdf&quot; rel=&quot;nofollow&quot;&gt;Wegman&lt;/a&gt;:

Principal Components
Principal Component Analysis (PCA) is a method for reducing the dimension of a high dimensional data set while preserving most of the information in those data. Dimension is here taken to mean the number of distinct variables (proxies). In the context of paleoclimatology, the proxy variables are the high dimensional data set consisting of several time series that are intended to carry the temperature signal. The proxy data set in general will have a large number of interrelated or correlated variables. Principal component analysis tries to reduce the dimensionality of this data set while also trying to explain the variation present as much as possible. To achieve this, the original set of variables is transformed into a new set of variables, called the principal components (PC) that are uncorrelated and arranged in the order of decreasing &quot;explained variance.&quot; It is hoped that the first several PCs explain most of the variation that was present in the many original variables. The idea is that if most of the variation is explained by the first several principal components, then the remaining principal components may be ignored for all practical purposes and the dimension of the data set is effectively reduced.

2  What is the effect of choosing a limited calibration period with rising temperatures? And any subsequent recentering?

From Wegman again:

Principal component methods are normally structured so that each of the data time series (proxy data series) are centered on their respective means and appropriately scaled. The first principal component attempts to discover the composite series that explains the maximum amount of variance. The second principal component is another composite series that is uncorrelated with the first and that seeks to explain as much of the remaining variance as possible. The third, fourth and so on follow in a similar way. In MBH98/99 the authors make a simple seemingly innocuous and somewhat obscure calibration assumption. &lt;b&gt;Because the instrumental temperature records are only available for a limited window, they use instrumental temperature data from  1902-1995 to calibrate the proxy data set. This would seem reasonable except for the fact that temperatures were rising during this period. So that centering on this period has the effect of making the mean value for any proxy series exhibiting the same increasing trend to be decentered low. Because the proxy series exhibiting the rising trend are decentered, their calculated variance will be larger than their normal variance when calculated based on centered data, and hence they will tend to be selected preferentially as the first principal component.&lt;/b&gt; (In fact the effect of this can clearly be seen RPC no. 1 in Figure 5 in MBH98.). Thus, in effect, any proxy series that exhibits a rising trend in the calibration period will be preferentially added to the first principal component.... The net effect of the decentering is to preferentially choose the so-called hockey stick shapes.]]></description>
		<content:encoded><![CDATA[<p>Re#17 I&#8217;m afraid I&#8217;m not a statistician, so it&#8217;s difficult to put it in correct terminology, but I have read the Wegman Report, and will use parts of it to illustrate what I mean.</p>
<p>1  Why use PCA at all? My understanding is it is a method for consolidating a large number of data series, but Moberg only has 11.</p>
<p>from <a href="http://energycommerce.house.gov/108/home/07142006_Wegman_Report.pdf" rel="nofollow">Wegman</a>:</p>
<p>Principal Components<br />
Principal Component Analysis (PCA) is a method for reducing the dimension of a high dimensional data set while preserving most of the information in those data. Dimension is here taken to mean the number of distinct variables (proxies). In the context of paleoclimatology, the proxy variables are the high dimensional data set consisting of several time series that are intended to carry the temperature signal. The proxy data set in general will have a large number of interrelated or correlated variables. Principal component analysis tries to reduce the dimensionality of this data set while also trying to explain the variation present as much as possible. To achieve this, the original set of variables is transformed into a new set of variables, called the principal components (PC) that are uncorrelated and arranged in the order of decreasing &#8220;explained variance.&#8221; It is hoped that the first several PCs explain most of the variation that was present in the many original variables. The idea is that if most of the variation is explained by the first several principal components, then the remaining principal components may be ignored for all practical purposes and the dimension of the data set is effectively reduced.</p>
<p>2  What is the effect of choosing a limited calibration period with rising temperatures? And any subsequent recentering?</p>
<p>From Wegman again:</p>
<p>Principal component methods are normally structured so that each of the data time series (proxy data series) are centered on their respective means and appropriately scaled. The first principal component attempts to discover the composite series that explains the maximum amount of variance. The second principal component is another composite series that is uncorrelated with the first and that seeks to explain as much of the remaining variance as possible. The third, fourth and so on follow in a similar way. In MBH98/99 the authors make a simple seemingly innocuous and somewhat obscure calibration assumption. <b>Because the instrumental temperature records are only available for a limited window, they use instrumental temperature data from  1902-1995 to calibrate the proxy data set. This would seem reasonable except for the fact that temperatures were rising during this period. So that centering on this period has the effect of making the mean value for any proxy series exhibiting the same increasing trend to be decentered low. Because the proxy series exhibiting the rising trend are decentered, their calculated variance will be larger than their normal variance when calculated based on centered data, and hence they will tend to be selected preferentially as the first principal component.</b> (In fact the effect of this can clearly be seen RPC no. 1 in Figure 5 in MBH98.). Thus, in effect, any proxy series that exhibits a rising trend in the calibration period will be preferentially added to the first principal component&#8230;. The net effect of the decentering is to preferentially choose the so-called hockey stick shapes.</p>
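<p>A minimal Python sketch of the arithmetic behind the decentering effect described in the Wegman passage above. The series, the 100-step window, and the helper name mean_square_about are made up for illustration; this is not the MBH98 procedure, only the identity that a sum of squares taken about the wrong center exceeds the ordinary variance by the squared offset between the two centers:</p>

```python
import statistics


def mean_square_about(series, center):
    """Mean squared deviation of a series about an arbitrary center value."""
    return statistics.fmean([(x - center) ** 2 for x in series])


# Made-up series: flat for 900 steps, then a linear rise over the last 100.
series = [0.0] * 900 + [i / 100 for i in range(1, 101)]
calibration = series[-100:]  # hypothetical calibration window (the rising part)

full_mean = statistics.fmean(series)
calib_mean = statistics.fmean(calibration)

# Ordinary variance: deviations taken about the full-period mean.
centered = mean_square_about(series, full_mean)
# "Decentered" version: deviations taken about the calibration-window mean.
decentered = mean_square_about(series, calib_mean)

# The inflation equals the squared gap between the two means.
excess = (full_mean - calib_mean) ** 2

print("centered:  ", round(centered, 5))
print("decentered:", round(decentered, 5))
print("excess:    ", round(excess, 5))
```

<p>A series whose mean over the window equals its full-period mean gets no such inflation, which is why series with a trend in the window would tend to carry more apparent variance under this kind of centering.</p>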
]]></content:encoded>
	</item>
	<item>
		<title>By: bender</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36778</link>
		<dc:creator><![CDATA[bender]]></dc:creator>
		<pubDate>Sun, 23 Jul 2006 04:17:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36778</guid>
		<description><![CDATA[That&#039;s about as vague an answer as I thought I would get. I&#039;m really not qualified to judge. But if I were stuck, I would suggest checking out names of those who contribute code to R, especially members of the R core team. If they can&#039;t help you directly, they will know someone who can. The brightest statistician in the world may well be the guy at the top of it all, Brian Ripley. But what do I know? I used to follow their online help group for a number of years, and he was incredibly critical, brilliantly insightful. They have an online archive where you will find questions and answers to you-name-it, including PCA. That will give you the names of the people who are giving out all the answers.]]></description>
		<content:encoded><![CDATA[<p>That&#8217;s about as vague an answer as I thought I would get. I&#8217;m really not qualified to judge. But if I were stuck, I would suggest checking out names of those who contribute code to R, especially members of the R core team. If they can&#8217;t help you directly, they will know someone who can. The brightest statistician in the world may well be the guy at the top of it all, Brian Ripley. But what do I know? I used to follow their online help group for a number of years, and he was incredibly critical, brilliantly insightful. They have an online archive where you will find questions and answers to you-name-it, including PCA. That will give you the names of the people who are giving out all the answers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: TCO</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36777</link>
		<dc:creator><![CDATA[TCO]]></dc:creator>
		<pubDate>Sun, 23 Jul 2006 01:54:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36777</guid>
		<description><![CDATA[I want the name of a person who knows the thing inside and out from multiple angles.  What he learned in the books to use it.  The theory behind it.  Closely related variants.  Usage and misusage in the field.]]></description>
		<content:encoded><![CDATA[<p>I want the name of a person who knows the thing inside and out from multiple angles.  What he learned in the books to use it.  The theory behind it.  Closely related variants.  Usage and misusage in the field.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: bender</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36776</link>
		<dc:creator><![CDATA[bender]]></dc:creator>
		<pubDate>Sun, 23 Jul 2006 00:11:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36776</guid>
		<description><![CDATA[Strict-sense PCA? It is so standard a technique you would likely get a very long list of potential candidates. Presumably you want a reviewer for something? Explain more precisely what you want critiqued and it will be easier to come up with a good list of possibilities. Are you sure it isn&#039;t some other aspect of climate reconstruction that you want reviewed, like Mannomatic/RegEM &quot;training&quot; methods? Be precise.

Also, it would help to know who you&#039;re up against if you want to go one (or more) better.]]></description>
		<content:encoded><![CDATA[<p>Strict-sense PCA? It is so standard a technique you would likely get a very long list of potential candidates. Presumably you want a reviewer for something? Explain more precisely what you want critiqued and it will be easier to come up with a good list of possibilities. Are you sure it isn&#8217;t some other aspect of climate reconstruction that you want reviewed, like Mannomatic/RegEM &#8220;training&#8221; methods? Be precise.</p>
<p>Also, it would help to know who you&#8217;re up against if you want to go one (or more) better.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: TCO</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36775</link>
		<dc:creator><![CDATA[TCO]]></dc:creator>
		<pubDate>Sat, 22 Jul 2006 23:48:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36775</guid>
		<description><![CDATA[Who is the best expert on PCA in the country?  Somebody who knows both theory, application and misapplication.]]></description>
		<content:encoded><![CDATA[<p>Who is the best expert on PCA in the country?  Somebody who knows both theory, application and misapplication.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: bender</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36774</link>
		<dc:creator><![CDATA[bender]]></dc:creator>
		<pubDate>Sat, 22 Jul 2006 23:27:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36774</guid>
		<description><![CDATA[Re: #16
Will read the article on Monday to get the full context. (e.g. to see how that sentence ends, among other things)

But for the moment,

&lt;blockquote&gt;&quot;scaling its variance and adjusting its mean value so that these become identical to those in the instrumental record&quot; &lt;/blockquote&gt;

means nothing more than that: they rescaled the mean and rescaled the variance of some reconstruction vector to match that of the NH temperature record.

You disagree with rescaling for some reason? If t is temperature and p is proxy, then linear sensitivity of p to t would yield:

p=a*t+b (response function)

or

t&#039;=(p-b)/a (restated as calibration function)

Rescaling just means figuring out values for a (which controls variance of the series) and b (controls mean). Rescaling has no effect on the shape (autocorrelation structure, information content, whatever you want to call it) of the reconstituted t&#039;.

Maybe I&#039;m missing something? Like I said, I need to read a few papers.

...

P.S. If you have a very specific question in mind (e.g. dependent on the context provided by some paper), but phrase it in vague or general terms, then you&#039;re not going to get the answer you&#039;re seeking. If original point #14 had made reference to the paper I would have known there was context to the remark. As with TCO on the issue of how many PCs to interpret, you were after something very specific, but didn&#039;t provide the necessary context.

Not complaining. Just pointing out that that&#039;s why it takes 2-3 takes for me to understand what it is you&#039;re *really* after.]]></description>
		<content:encoded><![CDATA[<p>Re: #16<br />
Will read the article on Monday to get the full context. (e.g. to see how that sentence ends, among other things)</p>
<p>But for the moment,</p>
<blockquote><p>&#8220;scaling its variance and adjusting its mean value so that these become identical to those in the instrumental record&#8221; </p></blockquote>
<p>means nothing more than that: they rescaled the mean and rescaled the variance of some reconstruction vector to match that of the NH temperature record.</p>
<p>You disagree with rescaling for some reason? If t is temperature and p is proxy, then linear sensitivity of p to t would yield:</p>
<p>p=a*t+b (response function)</p>
<p>or</p>
<p>t&#8217;=(p-b)/a (restated as calibration function)</p>
<p>Rescaling just means figuring out values for a (which controls variance of the series) and b (controls mean). Rescaling has no effect on the shape (autocorrelation structure, information content, whatever you want to call it) of the reconstituted t&#8217;.</p>
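<p>A minimal Python sketch of the rescaling described above, assuming a and b are chosen by matching the mean and standard deviation of the proxy to those of the instrumental series. The numbers and the helper names rescale and correlation are made up for illustration and are not taken from any paper:</p>

```python
import statistics


def rescale(proxy, target):
    """Apply t' = (p - b) / a, with a and b chosen so that t' has the
    same mean and standard deviation as target (an assumed choice)."""
    a = statistics.pstdev(proxy) / statistics.pstdev(target)    # controls variance
    b = statistics.fmean(proxy) - a * statistics.fmean(target)  # controls mean
    return [(p - b) / a for p in proxy]


def correlation(x, y):
    """Pearson correlation, used here as a simple check on shape."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)
    return cov / (statistics.pstdev(x) * statistics.pstdev(y))


proxy = [2.0, 3.5, 3.0, 5.0, 4.5, 6.0]     # made-up proxy values
target = [-0.3, -0.1, 0.0, 0.1, 0.2, 0.4]  # made-up instrumental values

rescaled = rescale(proxy, target)

print("mean:", round(statistics.fmean(rescaled), 6))
print("sd:  ", round(statistics.pstdev(rescaled), 6))
print("corr with original proxy:", round(correlation(proxy, rescaled), 6))
```

<p>Because the calibration function is linear with positive a, the rescaled series stays perfectly correlated with the original proxy. Only its mean and spread change.</p>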
<p>Maybe I&#8217;m missing something? Like I said, I need to read a few papers.</p>
<p>&#8230;</p>
<p>P.S. If you have a very specific question in mind (e.g. dependent on the context provided by some paper), but phrase it in vague or general terms, then you&#8217;re not going to get the answer you&#8217;re seeking. If original point #14 had made reference to the paper I would have known there was context to the remark. As with TCO on the issue of how many PCs to interpret, you were after something very specific, but didn&#8217;t provide the necessary context.</p>
<p>Not complaining. Just pointing out that that&#8217;s why it takes 2-3 takes for me to understand what it is you&#8217;re *really* after.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MarkR</title>
		<link>http://climateaudit.org/2005/09/06/345/#comment-36773</link>
		<dc:creator><![CDATA[MarkR]]></dc:creator>
		<pubDate>Sat, 22 Jul 2006 20:53:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=345#comment-36773</guid>
		<description><![CDATA[&quot;We calibrated the Northern Hemisphere (NH) reconstruction by scaling its variance and adjusting its mean value so that these become identical to those in the instrumental record of NH annual mean temperatures in the overlapping period &lt;b&gt;&lt;em&gt;1856-1979&lt;/em&gt;&lt;/b&gt;

Supplementary methods

&lt;a href=&quot;http://www.nature.com/nature/journal/v433/n7026/suppinfo/nature03265.html&quot; rel=&quot;nofollow&quot;&gt;link&lt;/a&gt;

Isn&#039;t calibrating to strongly rising temperatures the same flaw as Mann et al?
Also, I don&#039;t see any mention of RE, and r2 results.
Surely what we have here is more spurious correlation, and spurious calibration?]]></description>
		<content:encoded><![CDATA[<p>&#8220;We calibrated the Northern Hemisphere (NH) reconstruction by scaling its variance and adjusting its mean value so that these become identical to those in the instrumental record of NH annual mean temperatures in the overlapping period <b><em>1856-1979</em></b></p>
<p>Supplementary methods</p>
<p><a href="http://www.nature.com/nature/journal/v433/n7026/suppinfo/nature03265.html" rel="nofollow">link</a></p>
<p>Isn&#8217;t calibrating to strongly rising temperatures the same flaw as Mann et al?<br />
Also, I don&#8217;t see any mention of RE, and r2 results.<br />
Surely what we have here is more spurious correlation, and spurious calibration?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
