<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: MBH and Partial Least Squares</title>
	<atom:link href="http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/feed/" rel="self" type="application/rss+xml" />
	<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/</link>
	<description>by Steve McIntyre</description>
	<lastBuildDate>Thu, 20 Jun 2013 06:24:11 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Lawrence Hickey</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52751</link>
		<dc:creator><![CDATA[Lawrence Hickey]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 19:47:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52751</guid>
		<description><![CDATA[Ok I&#039;m up for another attempt. Wonder if John A will have mercy on my embarrassment trying to get the tex fragments to work,
but I&#039;m up for another one. Here goes.

Dear Blog:
First: can you repeat where the MBH raw data is archived?

Second:
I have a fair grasp of linear algebra, but very limited knowledge of PCA, and through that lens,
I have this idealized, oversimplified overview of the reconstruction problem. Help me with any misunderstandings.

Suppose we have three proxies (for simplicity) P1,P2,P3.
Suppose the raw data is a table with 5 column headings, with T being temperature.

[date, P1 ,P2 ,P3 ,T]

The different proxy columns may have different units; tree ring width might be one example.

Sorting the table by date, most recent first, the temperatures are missing for rows $latex  r &gt; rv $ (the reconstruction period).
Take the first m rows as the calibration period.
Assume we have temperature values for rows in the calibration period.
Rows between m and rv form the validation period, which also have temperature values.
Let $latex (p_1,p_2,p_3) $ be the values of the three proxies for a particular row.


I would think the first job is to find a function f  to take an arbitrary $latex (p_1,p_2,p_3) $ to a temperature.
$latex f:(p_1,p_2,p_3) \rightarrow \hat{T} $.
I would form an m by n matrix A, with n=3, with columns P1,P2,P3. Then form an m by 1 column vector T from the corresponding m known temperatures
from the raw data table
and solve $latex Ax \hat{=} T $ in the least squares sense. This means finding the weight vector  $latex x=x_1,x_2,x_3 $ such that $latex  Ax=\hat{T} $
where $latex \hat{T} $ is the projection of T onto the column space of A.
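
To make that step concrete, here is a minimal sketch in Python that solves the normal equations for a tiny table. The numbers are made up for illustration (they are not MBH data) and the helper names are my own, so treat it as a toy under those assumptions rather than anything MBH actually does.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for k in range(n):
        # Pick the row with the largest pivot to keep the elimination stable.
        piv = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[piv] = aug[piv], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def least_squares(A, T):
    """Weights x that minimise the length of A x - T (normal equations)."""
    At = transpose(A)
    AtA = matmul(At, A)
    AtT = [sum(a * t for a, t in zip(row, T)) for row in At]
    return solve(AtA, AtT)

def predict(A, x):
    """Apply f row by row: T_hat = A x."""
    return [sum(a * w for a, w in zip(row, x)) for row in A]

# Hypothetical calibration rows: columns are P1, P2, P3.
A = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
]
# Hypothetical temperatures, built here as an exact combination of the proxies.
T = [2.0, -1.0, 0.5, 1.0, -0.5]

x = least_squares(A, T)
print(x)
```

With real data T will not lie exactly in the column space of A, and predict(A, x) then gives the projection rather than T itself. The same predict call on reconstruction-period rows gives the reconstructed temperatures.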

With this function f, generated using rows with dates in the calibration period, we can compare the length of the error vector $latex \hat{T} - T $
based on rows in the validation date range to that based on rows in the calibration date range.
Or we can form the correlation coefficient $latex \rho = \frac{\mbox{cov}(T,\hat{T})}{\sqrt{ \mbox{var}{(T)}\,\mbox{var}{(\hat{T})}}} $
over the validation date range to check that $latex \hat{T} $ and $latex T $ are correlated  when we move outside the calibration period
to include validation period dates as well.
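
A small Python sketch of those two checks, again with made-up numbers. I use the root mean square error instead of the raw vector length so that periods with different numbers of rows can be compared; that choice and the function names are my own assumptions.

```python
import math

def mean(v):
    return sum(v) / len(v)

def correlation(t, t_hat):
    """Correlation coefficient: cov(T, T_hat) / sqrt(var(T) * var(T_hat))."""
    mt, mh = mean(t), mean(t_hat)
    cov = sum((a - mt) * (b - mh) for a, b in zip(t, t_hat))
    var_t = sum((a - mt) ** 2 for a in t)
    var_h = sum((b - mh) ** 2 for b in t_hat)
    # The 1/(n-1) factors in cov and var cancel, so they are left out.
    return cov / math.sqrt(var_t * var_h)

def rms_error(t, t_hat):
    """Length of the error vector T_hat - T, scaled by the number of rows."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(t, t_hat)) / len(t))

# Hypothetical validation-period values.
t = [1.0, 2.0, 3.0, 4.0]
t_hat = [2.0, 4.0, 6.0, 8.0]

print(correlation(t, t_hat))
print(rms_error(t, t_hat))
```

The toy values also show why the two checks are not interchangeable: the estimate here is perfectly correlated with T while still being off by a factor of two, so the error measure is far from zero.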

Using function f, we can get projected temperatures $latex  \hat{T} $ for any date in the raw table.
From here draw the graphs of date by f(p1,p2,p3)=T for dates into the reconstruction period as well.
I guess one can use moving averages etc. when graphing, but that&#039;s a detail.

Reconstruction completed.

To get x, one can use either the SVD factorization of $latex A=U \Sigma V^T $ or use the older method of
$latex  A^T A x = A^{T} T  $  which factors positive definite $latex  A^T A  $ to get
$latex {(A^T A)}^{-1} = Q {\Lambda}^{-1} Q^T $. Multiply both sides of $latex A^T A x =  A^T\ T  $ by this to get weight vector x.

With $latex  A^T A = Q \Lambda Q^T  $ the eigenvectors are in the columns of Q and the eigenvalues in the diagonal $latex  \Lambda $,
one might chop  some of the smaller eigenvalues  in $latex  \Lambda $ to zero, before forming  $latex  {\Lambda}^{-1} $ for more smoothness
but that&#039;s the extent of manual intervention (fiddling) with this approach.
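
To spell that truncation out (a sketch, assuming the eigenvalues are sorted so that $latex \lambda_1 \ge \lambda_2 \ge \lambda_3 $, that only the first k are kept, and writing $latex q_i $ for the i-th column of Q, which is my own notation):

$latex x_k = \sum_{i=1}^{k} \frac{q_i^T A^T T}{\lambda_i} q_i $

With k=3 this is the ordinary least squares x. With k smaller, the terms belonging to the small eigenvalues are simply left out of the sum, so their reciprocals never get formed.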

My question is: where does principal component analysis fit in at all? It seems like it&#039;s beside the point. How does it help to get the function f
that takes us from a proxy row $latex p_1,p_2,p_3 $ to a temperature $latex \hat T $?

One could process A by taking each proxy column, subtracting its column mean from each element and dividing by the column standard deviation (with the usual 1/(m-1) scaling),
to make $latex A^T A = R $ into a correlation matrix R, and try to find a factor matrix F such that  $latex R=F F^T + D $, with D a diagonal matrix
for unassigned variance.
But we introduce the
human element into the selection of how such a factorization is done to account for all the correlations in R. In recasting R
(F might be a 3 by 2 matrix in the baby example, where we cook down the number of proxies to 2), we account for as much of R as we can with a
reduced number of synthesized factors F. But to what end?

Please don&#039;t tell me that they take the  $latex F^T F $ instead of A, and use the usual least squares method to construct the weight vector x.
This can&#039;t be what is done by MBH, is it?


Lastly, please explain what the 100 and 200 ... year windows are.
I would like to get the actual data and fiddle around with it myself.

~]]></description>
		<content:encoded><![CDATA[<p>Ok I&#8217;m up for another attempt. Wonder if John A will have mercy on my embarrassment trying to get the tex fragments to work,<br />
but I&#8217;m up for another one. Here goes.</p>
<p>Dear Blog:<br />
First: can you repeat where the MBH raw data is archived?</p>
<p>Second:<br />
I have a fair grasp of linear algebra, but very limited knowledge of PCA, and through that lens,<br />
I have this idealized, oversimplified overview of the reconstruction problem. Help me with any misunderstandings.</p>
<p>Suppose we have three proxies (for simplicity) P1,P2,P3.<br />
Suppose the raw data is a table with 5 column headings, with T being temperature.</p>
<p>[date, P1 ,P2 ,P3 ,T]</p>
<p>The different proxy columns may have different units; tree ring width might be one example.</p>
<p>Sorting the table by date, most recent first, the temperatures are missing for rows <img src='http://s0.wp.com/latex.php?latex=r+%3E+rv+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='r &gt; rv ' title='r &gt; rv ' class='latex' /> (the reconstruction period).<br />
Take the first m rows as the calibration period.<br />
Assume we have temperature values for rows in the calibration period.<br />
Rows between m and rv form the validation period, which also have temperature values.<br />
Let <img src='http://s0.wp.com/latex.php?latex=%28p_1%2Cp_2%2Cp_3%29+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='(p_1,p_2,p_3) ' title='(p_1,p_2,p_3) ' class='latex' /> be the values of the three proxies for a particular row.</p>
<p>I would think the first job is to find a function f  to take an arbitrary <img src='http://s0.wp.com/latex.php?latex=%28p_1%2Cp_2%2Cp_3%29+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='(p_1,p_2,p_3) ' title='(p_1,p_2,p_3) ' class='latex' /> to a temperature.<br />
<img src='http://s0.wp.com/latex.php?latex=f%3A%28p_1%2Cp_2%2Cp_3%29+%5Crightarrow+%5Chat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='f:(p_1,p_2,p_3) &#92;rightarrow &#92;hat{T} ' title='f:(p_1,p_2,p_3) &#92;rightarrow &#92;hat{T} ' class='latex' />.<br />
I would form an m by n matrix A, with n=3, with columns P1,P2,P3. Then form an m by 1 column vector T from the corresponding m known temperatures<br />
from the raw data table<br />
and solve <img src='http://s0.wp.com/latex.php?latex=Ax+%5Chat%7B%3D%7D+T+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='Ax &#92;hat{=} T ' title='Ax &#92;hat{=} T ' class='latex' /> in the least squares sense. This means finding the weight vector  <img src='http://s0.wp.com/latex.php?latex=x%3Dx_1%2Cx_2%2Cx_3+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='x=x_1,x_2,x_3 ' title='x=x_1,x_2,x_3 ' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=Ax%3D%5Chat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='Ax=&#92;hat{T} ' title='Ax=&#92;hat{T} ' class='latex' /><br />
where <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;hat{T} ' title='&#92;hat{T} ' class='latex' /> is the projection of T onto the column space of A.</p>
<p>With this function f, generated using rows with dates in the calibration period, we can compare the length of the error vector <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BT%7D+-+T+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;hat{T} - T ' title='&#92;hat{T} - T ' class='latex' /><br />
based on rows in the validation date range to that based on rows in the calibration date range.<br />
Or we can form the correlation coefficient <img src='http://s0.wp.com/latex.php?latex=%5Crho+%3D+%5Cfrac%7B%5Cmbox%7Bcov%7D%28T%2C%5Chat%7BT%7D%29%7D%7B%5Csqrt%7B+%5Cmbox%7Bvar%7D%7B%28T%29%7D%5C%2C%5Cmbox%7Bvar%7D%7B%28%5Chat%7BT%7D%29%7D%7D%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;rho = &#92;frac{&#92;mbox{cov}(T,&#92;hat{T})}{&#92;sqrt{ &#92;mbox{var}{(T)}&#92;,&#92;mbox{var}{(&#92;hat{T})}}} ' title='&#92;rho = &#92;frac{&#92;mbox{cov}(T,&#92;hat{T})}{&#92;sqrt{ &#92;mbox{var}{(T)}&#92;,&#92;mbox{var}{(&#92;hat{T})}}} ' class='latex' /><br />
over the validation date range to check that <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;hat{T} ' title='&#92;hat{T} ' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=T+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='T ' title='T ' class='latex' /> are correlated  when we move outside the calibration period<br />
to include validation period dates as well.</p>
<p>Using function f, we can get projected temperatures <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;hat{T} ' title='&#92;hat{T} ' class='latex' /> for any date in the raw table.<br />
From here draw the graphs of date by f(p1,p2,p3)=T for dates into the reconstruction period as well.<br />
I guess one can use moving averages etc. when graphing, but that&#8217;s a detail.</p>
<p>Reconstruction completed.</p>
<p>To get x, one can use either the SVD factorization of <img src='http://s0.wp.com/latex.php?latex=A%3DU+%5CSigma+V%5ET+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A=U &#92;Sigma V^T ' title='A=U &#92;Sigma V^T ' class='latex' /> or use the older method of<br />
<img src='http://s0.wp.com/latex.php?latex=A%5ET+A+x+%3D+A%5E%7BT%7D+T++&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A x = A^{T} T  ' title='A^T A x = A^{T} T  ' class='latex' />  which factors positive definite <img src='http://s0.wp.com/latex.php?latex=A%5ET+A++&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A  ' title='A^T A  ' class='latex' /> to get<br />
<img src='http://s0.wp.com/latex.php?latex=%7B%28A%5ET+A%29%7D%5E%7B-1%7D+%3D+Q+%7B%5CLambda%7D%5E%7B-1%7D+Q%5ET+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='{(A^T A)}^{-1} = Q {&#92;Lambda}^{-1} Q^T ' title='{(A^T A)}^{-1} = Q {&#92;Lambda}^{-1} Q^T ' class='latex' />. Multiply both sides of <img src='http://s0.wp.com/latex.php?latex=A%5ET+A+x+%3D++A%5ET%5C+T++&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A x =  A^T&#92; T  ' title='A^T A x =  A^T&#92; T  ' class='latex' /> by this to get weight vector x.</p>
<p>With <img src='http://s0.wp.com/latex.php?latex=A%5ET+A+%3D+Q+%5CLambda+Q%5ET++&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A = Q &#92;Lambda Q^T  ' title='A^T A = Q &#92;Lambda Q^T  ' class='latex' /> the eigenvectors are in the columns of Q and the eigenvalues in the diagonal <img src='http://s0.wp.com/latex.php?latex=%5CLambda+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;Lambda ' title='&#92;Lambda ' class='latex' />,<br />
one might chop  some of the smaller eigenvalues  in <img src='http://s0.wp.com/latex.php?latex=%5CLambda+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;Lambda ' title='&#92;Lambda ' class='latex' /> to zero, before forming  <img src='http://s0.wp.com/latex.php?latex=%7B%5CLambda%7D%5E%7B-1%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='{&#92;Lambda}^{-1} ' title='{&#92;Lambda}^{-1} ' class='latex' /> for more smoothness<br />
but that&#8217;s the extent of manual intervention (fiddling) with this approach.</p>
<p>My question is: where does principal component analysis fit in at all? It seems like it&#8217;s beside the point. How does it help to get the function f<br />
that takes us from a proxy row <img src='http://s0.wp.com/latex.php?latex=p_1%2Cp_2%2Cp_3+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='p_1,p_2,p_3 ' title='p_1,p_2,p_3 ' class='latex' /> to a temperature <img src='http://s0.wp.com/latex.php?latex=%5Chat+T+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;hat T ' title='&#92;hat T ' class='latex' />?</p>
<p>One could process A by taking each proxy column, subtracting its column mean from each element and dividing by the column standard deviation (with the usual 1/(m-1) scaling),<br />
to make <img src='http://s0.wp.com/latex.php?latex=A%5ET+A+%3D+R+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A = R ' title='A^T A = R ' class='latex' /> into a correlation matrix R, and try to find a factor matrix F such that  <img src='http://s0.wp.com/latex.php?latex=R%3DF+F%5ET+%2B+D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='R=F F^T + D ' title='R=F F^T + D ' class='latex' />, with D a diagonal matrix<br />
for unassigned variance.<br />
But we introduce the<br />
human element into the selection of how such a factorization is done to account for all the correlations in R. In recasting R<br />
(F might be a 3 by 2 matrix in the baby example, where we cook down the number of proxies to 2), we account for as much of R as we can with a<br />
reduced number of synthesized factors F. But to what end?</p>
<p>Please don&#8217;t tell me that they take the  <img src='http://s0.wp.com/latex.php?latex=F%5ET+F+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='F^T F ' title='F^T F ' class='latex' /> instead of A, and use the usual least squares method to construct the weight vector x.<br />
This can&#8217;t be what is done by MBH, is it?</p>
<p>Lastly, please explain what the 100 and 200 &#8230; year windows are.<br />
I would like to get the actual data and fiddle around with it myself.</p>
<p>~</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: J. Sperry</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52750</link>
		<dc:creator><![CDATA[J. Sperry]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 19:29:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52750</guid>
		<description><![CDATA[LH, that request would be for John A or Steve M.  While it&#039;s amusing to consider, TCO doesn&#039;t have that power.]]></description>
		<content:encoded><![CDATA[<p>LH, that request would be for John A or Steve M.  While it&#8217;s amusing to consider, TCO doesn&#8217;t have that power.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lawrence Hickey</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52749</link>
		<dc:creator><![CDATA[Lawrence Hickey]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 19:20:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52749</guid>
		<description><![CDATA[Tco - can you remove 30-37 and I will post one that works. Sorry.]]></description>
		<content:encoded><![CDATA[<p>Tco &#8211; can you remove 30-37 and I will post one that works. Sorry.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lawrence Hickey</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52748</link>
		<dc:creator><![CDATA[Lawrence Hickey]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 19:13:48 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52748</guid>
		<description><![CDATA[yup- wants hat not Hat. This thing is a disgrace. Tco can you remove the abortive attempts and I will send it again with small hat. Thanks.]]></description>
		<content:encoded><![CDATA[<p>yup- wants hat not Hat. This thing is a disgrace. Tco can you remove the abortive attempts and I will send it again with small hat. Thanks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lawrence Hickey</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52747</link>
		<dc:creator><![CDATA[Lawrence Hickey]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 19:11:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52747</guid>
		<description><![CDATA[doesn&#039;t like \Hat much either, does it TCO?
Wonder if small hat would be better. Let&#039;s see.
$latex  \hat{T}  $]]></description>
		<content:encoded><![CDATA[<p>doesn&#8217;t like \Hat much either, does it TCO?<br />
Wonder if small hat would be better. Let&#8217;s see.<br />
<img src='http://s0.wp.com/latex.php?latex=%5Chat%7BT%7D++&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;hat{T}  ' title='&#92;hat{T}  ' class='latex' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lawrence Hickey</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52746</link>
		<dc:creator><![CDATA[Lawrence Hickey]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 19:09:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52746</guid>
		<description><![CDATA[again into the breach:

Dear Blog:
First: can you repeat where the MBH raw data is archived?

Second:
I have a fair grasp of linear algebra, but very limited knowledge of PCA, and through that lens,
I have this idealized, oversimplified overview of the reconstruction problem. Help me with any misunderstandings.

Suppose we have three proxies (for simplicity) P1,P2,P3.
Suppose the raw data is a table with 5 column headings, with T being temperature.

[date, P1 ,P2 ,P3 ,T]

The different proxy columns may have different units; tree ring width might be one example.

Sorting the table by date, most recent first, the temperatures are missing for rows $latex  r &gt; rv $ (the reconstruction period).
Take the first m rows as the calibration period.
Assume we have temperature values for rows in the calibration period.
Rows between m and rv form the validation period, which also have temperature values.
Let $latex (p_1,p_2,p_3) $ be the values of the three proxies for a particular row.


I would think the first job is to find a function f  to take an arbitrary $latex (p_1,p_2,p_3) $ to a temperature.
$latex f:(p_1,p_2,p_3) \rightarrow \Hat{T} $.
I would form an m by n matrix A, with n=3, with columns P1,P2,P3. Then form an m by 1 column vector T from the corresponding m known temperatures
from the raw data table
and solve $latex Ax \Hat{=} T $ in the least squares sense. This means finding the weight vector  $latex x=x_1,x_2,x_3 $ such that $latex  Ax=\Hat{T} $
where $latex \Hat{T} $ is the projection of T onto the column space of A.

With this function f, generated using rows with dates in the calibration period, we can compare the length of the error vector $latex \Hat{T} - T $
based on rows in the validation date range to that based on rows in the calibration date range.
Or we can  form the correlation coefficient $latex \rho = \frac{\mbox{cov}(T,\Hat{T})}{\sqrt{ \mbox{var}{(T)},\mbox{var}{(\hat{T})}}} $
over the validation date range to check that $latex \Hat{T} $ and $latex T $ are correlated  when we move outside the calibration period
to include validation period dates as well.

Using function f, we can get projected temperatures $latex  \Hat{T} $ for any date in the raw table.
From here draw the graphs of date by f(p1,p2,p3)=T for dates into the reconstruction period as well.
I guess one can use moving averages etc. when graphing, but that&#039;s a detail.

Reconstruction completed.

To get x, one can use either the SVD factorization of $latex A=U \Sigma V^T $ or use the older method of
$latex  A^T A x = A^{T} T  $  which factors positive definite $latex  A^T A  $ to get
$latex {(A^T A)}^{-1} = Q {\Lambda}^{-1} Q^T $. Multiply both sides of $latex A^T A x =  A^T\ T  $ by this to get weight vector x.

With $latex  A^T A = Q \Lambda Q^T  $ the eigenvectors are in the columns of Q and the eigenvalues in the diagonal $latex  \Lambda $,
one might chop  some of the smaller eigenvalues  in $latex  \Lambda $ to zero, before forming  $latex  {\Lambda}^{-1} $ for more smoothness
but that&#039;s the extent of manual intervention (fiddling) with this approach.

My question is: where does principal component analysis fit in at all? It seems like it&#039;s beside the point. How does it help to get the function f
that takes us from a proxy row $latex p_1,p_2,p_3 $ to a temperature $latex \Hat T $?

One could process A by taking each proxy column, subtracting its column mean from each element and dividing by the column standard deviation (with the usual 1/(m-1) scaling),
to make $latex A^T A = R $ into a correlation matrix R, and try to find a factor matrix F such that  $latex R=F F^T + D $, with D a diagonal matrix
for unassigned variance.
But we introduce the
human element into the selection of how such a factorization is done to account for all the correlations in R. In recasting R
(F might be a 3 by 2 matrix in the baby example, where we cook down the number of proxies to 2), we account for as much of R as we can with a
reduced number of synthesized factors F. But to what end?

Please don&#039;t tell me that they take the  $latex F^T F $ instead of A, and use the usual least squares method to construct the weight vector x.
This can&#039;t be what is done by MBH, is it?


Lastly, please explain what the 100 and 200 ... year windows are.
I would like to get the actual data and fiddle around with it myself.]]></description>
		<content:encoded><![CDATA[<p>again into the breach:</p>
<p>Dear Blog:<br />
First: can you repeat where the MBH raw data is archived?</p>
<p>Second:<br />
I have a fair grasp of linear algebra, but very limited knowledge of PCA, and through that lens,<br />
I have this idealized, oversimplified overview of the reconstruction problem. Help me with any misunderstandings.</p>
<p>Suppose we have three proxies (for simplicity) P1,P2,P3.<br />
Suppose the raw data is a table with 5 column headings, with T being temperature.</p>
<p>[date, P1 ,P2 ,P3 ,T]</p>
<p>The different proxy columns may have different units; tree ring width might be one example.</p>
<p>Sorting the table by date, most recent first, the temperatures are missing for rows <img src='http://s0.wp.com/latex.php?latex=r+%3E+rv+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='r &gt; rv ' title='r &gt; rv ' class='latex' /> (the reconstruction period).<br />
Take the first m rows as the calibration period.<br />
Assume we have temperature values for rows in the calibration period.<br />
Rows between m and rv form the validation period, which also have temperature values.<br />
Let <img src='http://s0.wp.com/latex.php?latex=%28p_1%2Cp_2%2Cp_3%29+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='(p_1,p_2,p_3) ' title='(p_1,p_2,p_3) ' class='latex' /> be the values of the three proxies for a particular row.</p>
<p>I would think the first job is to find a function f  to take an arbitrary <img src='http://s0.wp.com/latex.php?latex=%28p_1%2Cp_2%2Cp_3%29+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='(p_1,p_2,p_3) ' title='(p_1,p_2,p_3) ' class='latex' /> to a temperature.<br />
<img src='http://s0.wp.com/latex.php?latex=f%3A%28p_1%2Cp_2%2Cp_3%29+%5Crightarrow+%5CHat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='f:(p_1,p_2,p_3) &#92;rightarrow &#92;Hat{T} ' title='f:(p_1,p_2,p_3) &#92;rightarrow &#92;Hat{T} ' class='latex' />.<br />
I would form an m by n matrix A, with n=3, with columns P1,P2,P3. Then form an m by 1 column vector T from the corresponding m known temperatures<br />
from the raw data table<br />
and solve <img src='http://s0.wp.com/latex.php?latex=Ax+%5CHat%7B%3D%7D+T+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='Ax &#92;Hat{=} T ' title='Ax &#92;Hat{=} T ' class='latex' /> in the least squares sense. This means finding the weight vector  <img src='http://s0.wp.com/latex.php?latex=x%3Dx_1%2Cx_2%2Cx_3+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='x=x_1,x_2,x_3 ' title='x=x_1,x_2,x_3 ' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=Ax%3D%5CHat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='Ax=&#92;Hat{T} ' title='Ax=&#92;Hat{T} ' class='latex' /><br />
where <img src='http://s0.wp.com/latex.php?latex=%5CHat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;Hat{T} ' title='&#92;Hat{T} ' class='latex' /> is the projection of T onto the column space of A.</p>
<p>With this function f, generated using rows with dates in the calibration period, we can compare the length of the error vector <img src='http://s0.wp.com/latex.php?latex=%5CHat%7BT%7D+-+T+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;Hat{T} - T ' title='&#92;Hat{T} - T ' class='latex' /><br />
based on rows in the validation date range to that based on rows in the calibration date range.<br />
Or we can  form the correlation coefficient <img src='http://s0.wp.com/latex.php?latex=%5Crho+%3D+%5Cfrac%7B%5Cmbox%7Bcov%7D%28T%2C%5CHat%7BT%7D%29%7D%7B%5Csqrt%7B+%5Cmbox%7Bvar%7D%7B%28T%29%7D%2C%5Cmbox%7Bvar%7D%7B%28%5Chat%7BT%7D%29%7D%7D%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;rho = &#92;frac{&#92;mbox{cov}(T,&#92;Hat{T})}{&#92;sqrt{ &#92;mbox{var}{(T)},&#92;mbox{var}{(&#92;hat{T})}}} ' title='&#92;rho = &#92;frac{&#92;mbox{cov}(T,&#92;Hat{T})}{&#92;sqrt{ &#92;mbox{var}{(T)},&#92;mbox{var}{(&#92;hat{T})}}} ' class='latex' /><br />
over the validation date range to check that <img src='http://s0.wp.com/latex.php?latex=%5CHat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;Hat{T} ' title='&#92;Hat{T} ' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=T+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='T ' title='T ' class='latex' /> are correlated  when we move outside the calibration period<br />
to include validation period dates as well.</p>
<p>Using function f, we can get projected temperatures <img src='http://s0.wp.com/latex.php?latex=%5CHat%7BT%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;Hat{T} ' title='&#92;Hat{T} ' class='latex' /> for any date in the raw table.<br />
From here draw the graphs of date by f(p1,p2,p3)=T for dates into the reconstruction period as well.<br />
I guess one can use moving averages etc. when graphing, but that&#8217;s a detail.</p>
<p>Reconstruction completed.</p>
<p>To get x, one can use either the SVD factorization of <img src='http://s0.wp.com/latex.php?latex=A%3DU+%5CSigma+V%5ET+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A=U &#92;Sigma V^T ' title='A=U &#92;Sigma V^T ' class='latex' /> or use the older method of<br />
<img src='http://s0.wp.com/latex.php?latex=A%5ET+A+x+%3D+A%5E%7BT%7D+T++&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A x = A^{T} T  ' title='A^T A x = A^{T} T  ' class='latex' />  which factors positive definite <img src='http://s0.wp.com/latex.php?latex=A%5ET+A++&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A  ' title='A^T A  ' class='latex' /> to get<br />
<img src='http://s0.wp.com/latex.php?latex=%7B%28A%5ET+A%29%7D%5E%7B-1%7D+%3D+Q+%7B%5CLambda%7D%5E%7B-1%7D+Q%5ET+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='{(A^T A)}^{-1} = Q {&#92;Lambda}^{-1} Q^T ' title='{(A^T A)}^{-1} = Q {&#92;Lambda}^{-1} Q^T ' class='latex' />. Multiply both sides of <img src='http://s0.wp.com/latex.php?latex=A%5ET+A+x+%3D++A%5ET%5C+T++&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A x =  A^T&#92; T  ' title='A^T A x =  A^T&#92; T  ' class='latex' /> by this to get weight vector x.</p>
<p>With <img src='http://s0.wp.com/latex.php?latex=A%5ET+A+%3D+Q+%5CLambda+Q%5ET++&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A = Q &#92;Lambda Q^T  ' title='A^T A = Q &#92;Lambda Q^T  ' class='latex' /> the eigenvectors are in the columns of Q and the eigenvalues in the diagonal <img src='http://s0.wp.com/latex.php?latex=%5CLambda+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;Lambda ' title='&#92;Lambda ' class='latex' />,<br />
one might chop  some of the smaller eigenvalues  in <img src='http://s0.wp.com/latex.php?latex=%5CLambda+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;Lambda ' title='&#92;Lambda ' class='latex' /> to zero, before forming  <img src='http://s0.wp.com/latex.php?latex=%7B%5CLambda%7D%5E%7B-1%7D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='{&#92;Lambda}^{-1} ' title='{&#92;Lambda}^{-1} ' class='latex' /> for more smoothness<br />
but that&#8217;s the extent of manual intervention (fiddling) with this approach.</p>
<p>My question is: where does principal component analysis fit in at all? It seems like it&#8217;s beside the point. How does it help to get the function f<br />
that takes us from a proxy row <img src='http://s0.wp.com/latex.php?latex=p_1%2Cp_2%2Cp_3+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='p_1,p_2,p_3 ' title='p_1,p_2,p_3 ' class='latex' /> to a temperature <img src='http://s0.wp.com/latex.php?latex=%5CHat+T+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='&#92;Hat T ' title='&#92;Hat T ' class='latex' />?</p>
<p>One could process A by taking each proxy column, subtracting its column mean from each element and dividing by the column standard deviation (with the usual 1/(m-1) scaling),<br />
to make <img src='http://s0.wp.com/latex.php?latex=A%5ET+A+%3D+R+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='A^T A = R ' title='A^T A = R ' class='latex' /> into a correlation matrix R, and try to find a factor matrix F such that  <img src='http://s0.wp.com/latex.php?latex=R%3DF+F%5ET+%2B+D+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='R=F F^T + D ' title='R=F F^T + D ' class='latex' />, with D a diagonal matrix<br />
for unassigned variance.<br />
But we introduce the<br />
human element into the selection of how such a factorization is done to account for all the correlations in R. In recasting R<br />
(F might be a 3 by 2 matrix in the baby example, where we cook down the number of proxies to 2), we account for as much of R as we can with a<br />
reduced number of synthesized factors F. But to what end?</p>
<p>Please don&#8217;t tell me that they take the  <img src='http://s0.wp.com/latex.php?latex=F%5ET+F+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='F^T F ' title='F^T F ' class='latex' /> instead of A, and use the usual least squares method to construct the weight vector x.<br />
This can&#8217;t be what is done by MBH, is it?</p>
<p>Lastly, please explain what the 100 and 200 &#8230; year windows are.<br />
I would like to get the actual data and fiddle around with it myself.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: TCO</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52745</link>
		<dc:creator><![CDATA[TCO]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 18:31:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52745</guid>
		<description><![CDATA[It has problems with the less than sign.  Thinks it is a tag.]]></description>
		<content:encoded><![CDATA[<p>It has problems with the less than sign.  Thinks it is a tag.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lawrence Hickey</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52744</link>
		<dc:creator><![CDATA[Lawrence Hickey]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 18:21:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52744</guid>
		<description><![CDATA[I&#039;m beat. My question is now: what do I need to do to send a valid latex document and not have the site choke on it? Maybe it&#039;s too long.]]></description>
		<content:encoded><![CDATA[<p>I&#8217;m beat. My question is now: what do I need to do to send a valid latex document and not have the site choke on it? Maybe it&#8217;s too long.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lawrence Hickey</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52743</link>
		<dc:creator><![CDATA[Lawrence Hickey]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 18:18:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52743</guid>
		<description><![CDATA[OK, I give up. Here is the LaTeX version, ugly as sin. I replaced the $ signs with the tags, but it won&#039;t work. It looks fine as a LaTeX
document.

\documentclass[12pt]{article}
\usepackage{latexsym}
\usepackage{amsmath}
\begin{document}

Dear Blog: \\
First: can you repeat where the MBH raw data is archived?

Next: \\
I have a fair grasp of linear algebra, but very limited knowledge of PCA, and through that lens,
I have this idealized, oversimplified overview of the reconstruction problem. Please correct any misunderstandings.

Suppose we have three proxies (for simplicity): P1, P2, P3.
Suppose the raw data is a table with 5 column headings, with T being temperature.

[date, P1, P2, P3, T]

The different proxy columns may have different units; tree ring width might be one example.

With the table sorted by date, most recent first, the temperatures are missing for rows $ r &gt; r_v$ (the reconstruction period).
Take the first m rows, $1 \le r]]></description>
		<content:encoded><![CDATA[<p>OK, I give up. Here is the LaTeX version, ugly as sin. I replaced the $ signs with the tags, but it won&#8217;t work. It looks fine as a LaTeX<br />
document.</p>
<p>\documentclass[12pt]{article}<br />
\usepackage{latexsym}<br />
\usepackage{amsmath}<br />
\begin{document}</p>
<p>Dear Blog: \\<br />
First: can you repeat where the MBH raw data is archived?</p>
<p>Next: \\<br />
I have a fair grasp of linear algebra, but very limited knowledge of PCA, and through that lens,<br />
I have this idealized, oversimplified overview of the reconstruction problem. Please correct any misunderstandings.</p>
<p>Suppose we have three proxies (for simplicity): P1, P2, P3.<br />
Suppose the raw data is a table with 5 column headings, with T being temperature.</p>
<p>[date, P1, P2, P3, T]</p>
<p>The different proxy columns may have different units; tree ring width might be one example.</p>
<p>With the table sorted by date, most recent first, the temperatures are missing for rows $ r &gt; r_v$ (the reconstruction period).<br />
Take the first m rows, $1 \le r</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lawrence Hickey</title>
		<link>http://climateaudit.org/2006/06/13/mbh-and-partial-least-squares/#comment-52742</link>
		<dc:creator><![CDATA[Lawrence Hickey]]></dc:creator>
		<pubDate>Thu, 29 Jun 2006 18:14:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.climateaudit.org/?p=707#comment-52742</guid>
		<description><![CDATA[(Trying again.) Your TeX macro didn&#039;t like some of my expressions:
First: can you repeat where the MBH raw data is archived?

Second:
I have a fair grasp of linear algebra, but very limited knowledge of PCA, and through that lens,
I have this idealized, oversimplified overview of the reconstruction problem. Please correct any misunderstandings.

Suppose we have three proxies (for simplicity): P1, P2, P3.
Suppose the raw data is a table with 5 column headings, with T being temperature.

[date, P1, P2, P3, T]

The different proxy columns may have different units; tree ring width might be one example.

With the table sorted by date, most recent first, the temperatures are missing for rows $latex  r &gt; r_v $ (the reconstruction period).
Take the first m rows   as the calibration period. Assume we have temperature values for rows in the calibration period.
Rows  m]]></description>
		<content:encoded><![CDATA[<p>(Trying again.) Your TeX macro didn&#8217;t like some of my expressions:<br />
First: can you repeat where the MBH raw data is archived?</p>
<p>Second:<br />
I have a fair grasp of linear algebra, but very limited knowledge of PCA, and through that lens,<br />
I have this idealized, oversimplified overview of the reconstruction problem. Please correct any misunderstandings.</p>
<p>Suppose we have three proxies (for simplicity): P1, P2, P3.<br />
Suppose the raw data is a table with 5 column headings, with T being temperature.</p>
<p>[date, P1, P2, P3, T]</p>
<p>The different proxy columns may have different units; tree ring width might be one example.</p>
<p>With the table sorted by date, most recent first, the temperatures are missing for rows <img src='http://s0.wp.com/latex.php?latex=r+%3E+r_v+&amp;bg=ffffff&amp;fg=000&amp;s=0' alt='r &gt; r_v ' title='r &gt; r_v ' class='latex' /> (the reconstruction period).<br />
Take the first m rows   as the calibration period. Assume we have temperature values for rows in the calibration period.<br />
Rows  m</p>
]]></content:encoded>
	</item>
</channel>
</rss>
