Re: RomanM (#80), You are indeed correct . . . and after I figured it out, I feel a bit stupid.

.

I had thought that such a correction improved the verification statistics. However, I mis-named the variable that I applied the correction to, so all of my test reconstructions were run with just the stations scaled by the eigenvector weights and no correction. When I figured that out and actually applied the correction, the verification statistics were degraded a bit.

.

And then, after thinking about it, I started wondering why I thought that would work in the first place.

.

Sometimes . . . not so smart is I.

.

I redid this explanation in Word and posed it as a question to Dr. Beckers. The .pdf – with a MUCH improved explanation of what I did – is here:

.

]]>

. ]]>

Re: Ryan O (#84), I give up. Eqn. 8 describes a matrix with the scaled station data times the weighting factor placed beside PC 1.

]]>Re: Ryan O (#83),

.

Stupid human.

.

Eqn. 8:

Re: Ryan O (#82),

.

Eqn. in #2 should read:

.

#4 should read: Place PC 1 ( ) next to the completely imputed station data.

.

Eqn. in #8 should read: ( )

.

Those “PCscaled” and “eigenscaled” notes should be part of the subscripts. They’re just descriptors.

Re: RomanM (#80), I don’t know whether this will change your opinion, but I think I need to get off my ass and write down the math for the latest algorithm. The algorithm no longer does a simple regression of station data and the PCs; it explicitly uses the eigenvector to weight the station data and performs the regression on the weighted matrix. The steps to get to this point are:

.

1. Perform RegEM on the station data without the PCs.

.

2. Find the PCs for the AVHRR data:

.

.

3. Extract the spatial weights for each PC at the station locations. Let’s call this the spatial-weight matrix.

.

4. Place PC 1 next to the completely imputed station data.

.

The next few steps take place within my modified RegEM algorithm:

.

5. Scale all series to unit variance.

.

6. Scale the station data by the appropriate vector from the spatial-weight matrix of step 3, and rescale such that the range of the weights equals 1. This is for consistency: it maintains the same scaling of station data relative to the PC regardless of which PC you are imputing.

.

.

7. Multiply the scaled station data by a weighting factor (default 1). This provides a means of performing the reconstruction either as an extrapolation of the PCs (small factor), by de-emphasizing the station data, or as an interpolation (large factor), by emphasizing the station data. How this works is clear from the next step.

.

8. In RegEM, we now have the following matrix:

.

.

We perform the SVD *directly* on this matrix. Because each series is scaled by the appropriate constant from the spatial-weight matrix, series where the PC explains the most variation will be selected as the low-order PCs, while series where the PC explains little variation will be relegated to the higher-order PCs. This ensures that the stations representing high-weight regions are represented in the SVD of the matrix. It also maximizes the contribution to the total error of the high-weight stations during the OLS regression, which causes the high-weight stations to drive the imputation.

.

Now the purpose of the weighting factor is apparent: a small factor weights errors in the PC high, so minimizing imputation error in the PC minimizes the overall error. This results in an extrapolation of the PCs. A large factor weights errors in the stations high, so minimizing imputation error in the stations minimizes the overall error. The latter forces the PC to the station solution. This results in anchoring the original eigenvector at the station values during the reconstruction, and using the coefficients to predict temperature at locations where the stations are not present – in other words, becomes analogous to the kriging estimator for kriging.

.

9. Once the algorithm converges, we extract the imputed PC.

.

10. At this point, we can recover the temperature contribution from the imputed PC 1:

.

.
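Steps 5 through 8 can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions, not the actual modified RegEM code: the function name `build_weighted_matrix`, the argument names, and the toy data are all hypothetical, "range equals 1" is read as dividing the weights by (max minus min), and the PC is treated as complete, whereas in the real algorithm it has missing values that RegEM fills iteratively.

```python
import numpy as np

def build_weighted_matrix(stations, pc, weights, factor=1.0):
    """Assemble the step-8 matrix under the assumptions stated above.

    stations : (n_time, n_stations) fully imputed station data
    pc       : (n_time,) the PC placed beside the station data
    weights  : (n_stations,) eigenvector weights at the station locations
    factor   : scalar weighting factor from step 7 (default 1)
    """
    # Step 5: center and scale every series to unit variance.
    st = (stations - stations.mean(axis=0)) / stations.std(axis=0)
    p = (pc - pc.mean()) / pc.std()
    # Step 6 (assumed reading): rescale the weights so their range is 1.
    w = weights / (weights.max() - weights.min())
    # Steps 6-7: weight each station column, then apply the overall factor.
    st = factor * st * w
    # Step 8: weighted station block with the PC as the last column.
    return np.column_stack([st, p])

# Toy data, purely illustrative.
rng = np.random.default_rng(0)
stations = rng.normal(size=(120, 5))
pc = rng.normal(size=120)
weights = np.array([0.9, 0.5, 0.1, -0.3, -0.6])

X = build_weighted_matrix(stations, pc, weights, factor=1.0)
# The SVD is taken directly on the weighted matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
col_std = X.std(axis=0)  # station columns now carry |weight| as their spread
```

After the scaling, each station column has a standard deviation equal to the absolute value of its rescaled weight, which is what lets high-weight stations dominate the leading singular vectors. Raising `factor` increases the station block's share of the total variance relative to the PC column.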

The reason I felt it was necessary to remove the contribution from PC 1 before calculating PC 2 is the station-weighting step. If you did not do this step, then I agree that removing the contribution is unnecessary. However, weighting the stations changes the regression results. As each PC has its own set of station weights in the spatial-weight matrix (and these station weights were derived by explaining *remaining* variation after the prior PCs had been calculated), for each subsequent regression to be valid you need to remove the contribution of the prior PCs. Failure to do so results in convergence to a different (albeit similar) answer.

.

When I remove the contribution of prior PCs, the net effect is a reconstruction solution that would take a higher regpar to reach if the contribution were not removed. I’m not totally sure I’m on the right track, though.

.

Re: Steve McIntyre (#81), This latest algorithm does explicitly use the geographic information. Also, it’s faster than the previous ones – so I’ve done test reconstructions using 100+ PCs. We could easily do reconstructions using ALL of the PCs, which makes determining the truncation point simple. After ~13 it doesn’t matter. You don’t have to worry about including too many – because you *can’t* include too many.

.

Also, while I am interested in the average, I am also interested in the geographic distribution of the trends since it was the odd geographical distribution that made the Steig paper so noteworthy. The last 7 methods (or thereabouts) Jeff Id and I have done all show the same geographic distribution. The only outlier is Steig’s.

.

*presses submit and prays the Latex worked…*

On the other hand, area-averaging methods such as ore reserves explicitly use the geometry.

Here are a couple of thoughts.

I think that we can say that there is very strong evidence for spatial autocorrelation with more or less negative exponential decorrelation. (Far more evidence for this than Steig’s 3 **physical** modes). This should enable some analytic simplifications that simply cut through the PC problem altogether. I don’t think that there’s any evidence of real modes. And I don’t see any purpose in truncating things merely for computational reasons without trying to think through the math. Maybe there’s an analytical sum of series somewhere.
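As a rough illustration of what negative exponential decorrelation means, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the function name `exponential_correlation`, the random planar coordinates in km (real work would use great-circle distances between actual station positions), and the 1000 km length scale, which is not an estimate from the data.

```python
import numpy as np

def exponential_correlation(coords, length_scale):
    """Correlation matrix exp(-d / L) for pairwise distances d."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-dist / length_scale)

rng = np.random.default_rng(2)
# Hypothetical station positions on a plane, in km.
coords = rng.uniform(0.0, 3000.0, size=(6, 2))
C = exponential_correlation(coords, length_scale=1000.0)
```

With a model like this, the whole correlation structure is described by one parameter (the length scale), which is the kind of analytic simplification that sidesteps choosing a PC truncation point.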

Second, our interest at any given time is not really in the infilled values at particular stations but in the Antarctic continental average at any given time given spotty measurements. On a spatial autocorrelation basis, this is an inverse of the Fourier heat equation (which also makes sense from first principles) – which is going to be approximated by the disk case where you have a value at the center and some values on the boundary. My guess is that the analytic solution to this sort of problem is a form of area-averaging. These things usually end up with a geometric connection.

]]>Re: Ryan O (#79),

Ryan, it’s not clear to me that, under the circumstances, any “correction” is necessary.

I assume that what you are talking about here is as follows:

— The station data is infilled for the entire period first (either by itself or by using the satellite PCs to infill the post-1982 period and then extending to the earlier times).

— This infilled station data is used to extend the PCs.

If this is the case, then the station data is being used merely to predict the PCs and predicting the first PC does not “use up” information which is no longer available to predict the others. In a simple regression situation, if I am predicting two variables using a fixed set of predictors, the least squares solution for the simultaneous prediction is numerically the same as that of the separate individual regressions. Of course, this latter fact is not true for total least squares regression, but the underlying concepts appear to be the same.
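The ordinary least squares point is easy to check numerically. The sketch below uses made-up random data (the sizes and variable names are arbitrary assumptions) and compares one simultaneous fit of two response variables against two separate fits on the same predictors. It says nothing about the total least squares case.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 4
X = rng.normal(size=(n, k))                      # fixed set of predictors
true_coefs = rng.normal(size=(k, 2))
Y = X @ true_coefs + 0.1 * rng.normal(size=(n, 2))  # two response variables

# Simultaneous prediction: both responses fitted in one call.
B_joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Separate individual regressions, one response at a time.
b1, *_ = np.linalg.lstsq(X, Y[:, 0], rcond=None)
b2, *_ = np.linalg.lstsq(X, Y[:, 1], rcond=None)
B_separate = np.column_stack([b1, b2])

same = np.allclose(B_joint, B_separate)
```

The two coefficient matrices should agree to floating-point precision, because the least squares criterion for several responses splits into an independent sum of squares for each response column.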

Your notion of adjusting the predictors comes from the decomposition of the variability of the station data, in which, quite correctly, if a PC summarizes some of the variability of the station temperature structure, it should be removed from the data before the next PC is calculated. That is not quite the situation you are looking at.

You could also try the following sequential procedure as a sort of in-between approach: Take the station data and impute the first PC. Then attach that to the station data matrix and use the new structure to impute the second PC (keeping the first PC imputation fixed), etc. This way, previous PCs have an effect on the later imputations (but not vice versa).

If I missed a step on what you are doing, please correct me and I will comment further.

Good stuff on getting a better handle on a not-so-transparent technique.

]]>