Steig Eigenvectors and Chladni Patterns

Last year, I did a few posts connecting spatial autocorrelation to something as mundane as the Stahle/SWM tree ring network. In the process, I observed something that I found quite interesting – that principal components applied to spatially autocorrelated series on constrained geometric shapes generated Chladni patterns, familiar from the acoustics of violin plates. The Antarctic vortex is an interesting example of another fairly constrained geometric shape. I've alluded a few times to the similarity of the supposedly "physical" Steig eigenvectors to Chladni patterns, and today I'll show why I made this comparison. In the figure below, I show, on the left, the first 3 Steig eigenvectors and, on the right, the first 3 eigenvectors from spatially autocorrelated sites arranged on a disk.

Both the similarities and differences are worthy of notice.

The first eigenvector of an isolated disk weights the interior points more heavily than points around the circumference – a pattern observable in the Steig eigenvector as well. The first Steig eigenvector is displaced somewhat to the east – but you’ll notice that Antarctica is by no means perfectly circular and the “center of gravity” is displaced to the east as well. My guess is that the same sort of graphic done on the actual Antarctic shape will displace to the east as well. I’ll check that some time.

The 2nd and 3rd Steig eigenvectors also have points in common and points of difference with the simple circular case. Both Steig eigenvectors are “two lobed”. Because the eigenvalues of the first two circular eigenvectors are identical, these two eigenvectors (EOFs are the same thing) are not “separable” in the sense of North et al 1982. Any linear combination is as likely as any other one. So in that sense, for a disk, there is no preferred axis. In the Antarctic case, the Transantarctic Mts look like they provide a reason for orienting the second Steig eigenvector. It would be an interesting exercise to evaluate just how much extra oomph there is in this symmetry breaking. The third Steig eigenvector is not as cleanly perpendicular to the 2nd eigenvector as in the disk example, but I think that one can readily discern this structure.


In the circular case, it's not as though eigenvectors of lower order are uninterpretable. I've posted up a pdf showing the plots for the first 16 eigenvectors, which show interesting and pretty patterns (thanks to Roman for showing how to make pdfs straight from R). In the circular case, you can get radially symmetric eigenvectors – a structure that seems like it might be highly relevant if one wants to incorporate seemingly negatively correlated data from islands in the Southern Ocean. I'm not saying that this is a relevant structure, only that this eigenvector might be empirically relevant – or it might not be.
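
For readers who want to experiment, here is a minimal sketch of the sort of calculation behind the disk panels: scatter sites on a disk, assume correlation decays exponentially with inter-site distance (the decorrelation length of 0.3 radii below is an arbitrary illustrative choice, not tuned to anything), and take the eigenvectors of the resulting correlation matrix.

# sketch only: eigenvectors of a spatially autocorrelated field on a disk
set.seed(1)
n=1500
r=sqrt(runif(n)); th=2*pi*runif(n)          # points roughly uniform on the unit disk
x=r*cos(th); y=r*sin(th)
D=as.matrix(dist(cbind(x,y)))               # inter-site distances
C=exp(-D/0.3)                               # assumed exponential decorrelation
E=eigen(C,symmetric=TRUE)
par(mfrow=c(1,3),mar=c(2,2,2,1))
col0=colorRampPalette(c("blue","grey90","red"))(13)
for (k in 1:3) {
v=E$vectors[,k]
plot(x,y,pch=19,cex=.5,col=col0[cut(v,13)],asp=1,main=paste("Disk eigenvector",k))
}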

As to how to summarize or truncate – I have no prescriptions. There are lots of schemes – Preisendorfer's Rule N has been mentioned. But I haven't seen any head-to-head analyses of Preisendorfer's Rule N against eigenvectors generated from spatially autocorrelated data – so I don't know whether it's a good rule in these circumstances or not.

Let’s return to the description of these three eigenvectors in Steig:

The first three principal components are statistically separable and can be meaningfully related to important dynamical features of high-latitude Southern Hemisphere atmospheric circulation, as defined independently by extrapolar instrumental data. The first principal component is significantly correlated with the SAM index (the first principal component of sea-level-pressure or 500-hPa geopotential heights for 20S–90S), and the second principal component reflects the zonal wave-3 pattern, which contributes to the Antarctic dipole pattern of sea-ice anomalies in the Ross Sea and Weddell Sea sectors [refs. 4, 8].

No proof or evidence was offered for any of these assertions. You’d think that Nature would require authors to provide evidence, but seemingly not in this case. The observation about the second eigenvector seems untrue on its face: the reported eigenvector has 2 lobes, not 3, which, in my eyes, disqualifies this interpretation.

The plots here provide more evidence (though not "proof") that the eigenvectors are simply what one would expect from applying principal components to spatially autocorrelated data on a not-quite-circular shape than Steig offered in support of his assertion that these things are "meaningfully related to important dynamical features of high-latitude Southern Hemisphere atmospheric circulation, as defined independently by extrapolar instrumental data."

Why did Steig use a cut-off parameter of k=3?

A question that Jean S inquired about before we were so rudely interrupted. The explanation in Steig et al was:

Principal component analysis of the weather station data produces results similar to those of the satellite data analysis, yielding three separable principal components. We therefore used the RegEM algorithm with a cut-off parameter k=3…. A disadvantage of excluding higher-order terms (k>3) is that this fails to fully capture the variance in the Antarctic Peninsula region. We accept this tradeoff because the Peninsula is already the best-observed region of the Antarctic.

I’ve sent an inquiry to one of the coauthors on what they mean by “separable principal components” and how they determined that there were three, as opposed to some other number. I would have thought that “significant” would be a more relevant term than “separable”, but we’ll see.

In the figure below, I show AWS trends for regpar = 1, 2, 3, 4, 5, 6, 16, 32. It turns out that 3 was a very fortuitous choice, as this proved to yield the maximum AWS trend, something that will, I'm sure, astonish most CA readers. For regpar=1, the trend was negative. (I can picture a line of argument as to why 1 is a better choice than 3 or 4, which I'll visit on another day.) As k increased further, the trend returned towards 0. Thus k, selected to be 3 no doubt from the purest of motives, yielded the maximum trend. I guess that was a small consolation for the bitter disappointment of failing to "fully capture the variance in the Antarctic Peninsula region" and it was definitely gracious of Steig and Mann to acquiesce in this selection under the circumstances.

AWS Trends under different regpar parameters (“RegEM Truncated PC”)

The graphic shows results for a method slightly varied from RegEM TTLS – let's call it RegEM Truncated PC. I'll explain the differences tomorrow. RegEM TTLS is a pig as regpar increases. RegEM TTLS yields rank-k reconstructions; "RegEM Truncated PC" also yields rank-k reconstructions, which were only about 1% different in benchmarks. For regpar = 1–6, RegEM TTLS has a similar pattern, but so far we haven't run RegEM TTLS with higher regpar values, as it will be VERY slow. (Jeff Id is going to try.)
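
For reference, here is a minimal sketch of the kind of loop behind the figure above, using the regem_pttls port described further down this page. The iteration count, the averaging over AWS columns and the column layout of the input matrix (63 AWS columns followed by 42 surface columns, as in the later post) are my assumptions, not a description of Steig's method.

source("http://www.climateaudit.info/scripts/regem/regem.functions.txt")   # regem_pttls port (see below)
trend=function(x) lm(x~ I((1:length(x))/12))$coef[2]        # deg C per year
regpars=c(1,2,3,4,5,6,16,32)
aws.trend=rep(NA,length(regpars)); names(aws.trend)=regpars
for (j in 1:length(regpars)) {
fit=regem_pttls(X=anomalies,maxiter=50,regpar=regpars[j])   # 'anomalies' as constructed in the later post
aws.trend[j]=mean(apply(fit[[length(fit)]]$X[,1:63],2,trend))  # average AWS trend at final iteration
}
aws.trend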

I’ve got a bit of a personal interest as to why they excluded the PC4. Seems like something we’ve visited before, doesn’t it.

Feb 24 Update: Jeff Id reports:

I ran reg EM up to regpar=14, after that the methods it uses to set up TTLS don’t work and the regpar is truncated back down. Speed wasn’t too bad, each iteration in the loop takes about 3 seconds but the trend didn’t converge easily. After 100 iterations the high order system didn’t converge but was creeping closer. At regpar=8 it did converge after about 65 iterations.

We're Back!

In case you’re wondering where we’ve been, the story is at Anthony Watts’ site. Anthony reports the resuscitation here. Thanks, Anthony. Will chat tomorrow.

Antarctic Spatial Autocorrelation #1

“Noisy” covariance matrices have been discussed here on many occasions in a variety of contexts, largely because the underlying strategy of Mannian methods is to calculate the covariance of everything to everything else and then calculate verification stats using methods that ignore the data mining that effectively takes place with huge covariance matrices. Steig et al 2009 is no exception.

The “justification” in Steig is as follows:

In essence, we use the spatial covariance structure of the surface temperature field to guide interpolation of the sparse but reliable 50-year-long records of 2-m temperature from occupied weather stations. Although it has been suggested that such interpolation is unreliable owing to the distances involved, [1. Turner, J. et al. Antarctic climate change during the last 50 years. Int. J. Climatol. 25, 279–294 (2005)] large spatial scales are not inherently problematic if there is high spatial coherence, as is the case in continental Antarctica [Schneider, D. P., Steig, E. J. & Comiso, J. Recent climate variability in Antarctica from satellite-derived temperature data. J. Clim. 17, 1569–1583 (2004)]

Which raises the obvious question: is there “high spatial coherence”? The citation for this (Schneider et al 2004) is unsurprisingly self-referential. Although the “active ingredient” in the pre-1980 reconstruction is the surface station record, Schneider et al 2004 provides no evidence for “high spatial coherence” in these records; instead, it discusses the AVHRR records.

I thought that it would be interesting to do a scatter plot of the distance between surface stations (here only the surface stations, not the AWS stations) against the inter-station correlation – the sort of low brow analysis that the Team abhors. Here is a location map of the stations. There are several Southern Ocean stations (e.g. Campbell Island, Macquarie Island near NZ and Grytviken near Argentina) that are included in the data set.

Here is a scatter plot of correlations against distance, evidencing a strong decay with distance. There is a highly significant fit to a simple relationship based on exponential decay with distance (r^2 of 0.76 and t = 71). (At this time, I'm not trying to assert that this is the form of a "true" relationship, only that it is very evident that there is strong spatial decorrelation in the first 2000 km.)

fm=lm(cor ~ I(exp(-dist/1000)-1), data=station)
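
For anyone wanting to reproduce this sort of plot, here is a sketch of how the 'station' data frame used in the fit above might be assembled. Here 'sanom' stands for a matrix of monthly anomalies (one column per surface station) and 'info' for a data frame of station latitudes and longitudes – both placeholders for whichever scrape is being used, not objects from Steig's archive.

# great-circle distance in km (spherical approximation)
gcd=function(lat1,lon1,lat2,lon2,R=6371) {
p=pi/180
a=sin((lat2-lat1)*p/2)^2+cos(lat1*p)*cos(lat2*p)*sin((lon2-lon1)*p/2)^2
2*R*asin(sqrt(pmin(1,a)))
}
C=cor(sanom,use="pairwise.complete.obs")     # inter-station correlations
idx=t(combn(ncol(sanom),2))                  # all station pairs
station=data.frame(cor=C[idx],
 dist=gcd(info$lat[idx[,1]],info$long[idx[,1]],info$lat[idx[,2]],info$long[idx[,2]]))
plot(station$dist,station$cor,pch=20,cex=.4,xlab="distance (km)",ylab="correlation")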

Here is a histogram of correlations for distances above 2500 km, showing that they are distributed on both sides of zero – on balance slightly negative. It seems to me that it would be uphill to demonstrate that high correlations in this histogram are “significant” “teleconnections”, as opposed to the sorts of correlation turned up in a population of nearly 1000 autocorrelated series.
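
Continuing the sketch above, the histogram can be produced along these lines:

long=station[station$dist>2500,]
hist(long$cor,breaks=40,main="Inter-station correlation, distance > 2500 km",xlab="correlation")
mean(long$cor,na.rm=TRUE)    # on balance slightly negative, as noted above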

Above 2500 km, the covariance between surface stations all just looks like noise to me. The above decay of correlation with distance strongly argues in favor of the “null” model being stations with spatial autocorrelation decaying to 0 by 2500 km or so.

Note: In comments below, readers have argued that this is too negative a characterization, Ryan pointing to the clump of positive correlations around 6000 km. In light of these comments, I re-examined the data set and found that the clump of positive long-distance correlations involves Campbell and Macquarie Islands against Grytviken and the Antarctic Peninsula stations; correlations to the Antarctic continent are slightly negative. For now, I'm just struck by the obvious decorrelation with distance. So far I've not carried out statistical analyses against a null model. You'd think that Steig et al would have done this sort of thing.

UPDATE Feb 24: Here’s a variation of the above graphic limiting stations to “continental” Antarctica. First here are the stations used:

Location map of surface stations used in correlation-distance plot below. 17 stations are excluded, of which 5 are islands in the Southern Ocean and 12 are in the northern part of the Peninsula.

Now here is the same scatter plot for correlation vs distance, which perhaps shows a more coherent decorrelation than with the island and far peninsula stations included.

Scatter Plot of Correlation vs Distance, 25 Stations.

Re-visiting the AWS Recon PCs

Yesterday, we discussed the remarkable decomposition of the AVHRR reconstruction into 3 PCs, initially observed by Roman and discussed yesterday by Jeff Id. I thought that it would be interesting to see what happened with the PC3 in the AWS reconstruction (and, in the process, to do some comparisons of the RegEM emulation (file-compatible between Jeff Id and myself) to the archived Steig AWS reconstruction). Again, some interesting results.

First, we again see a loading onto the first three PCs (high values for first three eigenvalues), but the loading is not nearly as concentrated as the AVHRR case. Right now I presume that this is due to the huge number of AVHRR columns (5509 versus 63 AWS) relative to 42 surface columns.

svd.steig=svd(recon_aws)    # SVD of the archived AWS reconstruction
par(mar=c(3,3,2,1))
barplot(svd.steig$d,col="grey80",ylim=c(0,360));box()
title("Recon AWS Eigenvalues")
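
One quick way to quantify "not nearly as concentrated" (my addition, not part of the original script) is the share of the total sum of squares carried by the first three singular values:

round(sum(svd.steig$d[1:3]^2)/sum(svd.steig$d^2),3)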

I then plotted out the first 3 PCs fully expecting to see something like the AVHRR PC3 in which the pre-1980 amplitude was negligible (which I believe to be due to the decision to infill with zeros though I’m not 100% sure of this.) To my surprise, there was no such pattern as shown below.

plot.ts(ts(svd.steig$u[,1:3],start=c(1957,1),freq=12),main="Recon AWS PC1-3")

Being cautious about these things, I plotted up the PC4-6 for good measure. (After all, didn’t Mann rely on a PC4 somewhere?) Here they are:

plot.ts(ts(svd.steig$u[,4:6],start=c(1957,1),freq=12),main="Recon AWS PC4-6")

As with the AVHRR reconstruction PC3, the AWS PC4-6 – and each has a noticeable eigenvalue – have the same pattern of extreme attenuation in the portion before measurements actually started. In my opinion, these patterns must surely relate to the initial decision to infill missing values with zero.
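
A rough way to quantify the attenuation (a sketch only; the 1980 split is approximate):

pcs=ts(svd.steig$u[,1:6],start=c(1957,1),freq=12)
pre=window(pcs,end=c(1979,12)); post=window(pcs,start=1980)
round(apply(post,2,sd)/apply(pre,2,sd),2)   # ratios well above 1 for PC4-6 indicate pre-1980 attenuation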

Schneider said in a comment on another thread that infilling with zero wouldn’t matter, because of the properties of the EM algorithm. Perhaps so, but based on what we’re seeing, it sure doesn’t look like this is necessarily the case for the RegEM PTTLS version used by Steig. So I’d definitely like to understand the basis of Schneider’s assertion better than I do right now and see if there’s something in the Steig PTTLS variation that upsets this property. Now we can’t directly check the Steig variation as he’s refused to disclose his code as used, but I can check with Schneider’s methodology and I’ll try to get to that in a day or two.

McCullough and McKitrick on Due Diligence

Bruce McCullough and Ross McKitrick today published an interesting article under the auspices of the Fraser Institute entitled Check the Numbers: The Case for Due Diligence in Policy Formation.

Their abstract states:

Empirical research in academic journals is often cited as the basis for public policy decisions, in part because people think that the journals have checked the accuracy of the research. Yet such work is rarely subjected to independent checks for accuracy during the peer review process, and the data and computational methods are so seldom disclosed that post-publication verification is equally rare. This study argues that researchers and journals have allowed habits of secrecy to persist that severely inhibit independent replication. Non-disclosure of essential research materials may have deleterious scientific consequences, but our concern herein is something different: the possible negative effects on public policy formation. When a piece of academic research takes on a public role, such as becoming the basis for public policy decisions, practices that obstruct independent replication, such as refusal to disclose data, or the concealment of details about computational methods, prevent the proper functioning of the scientific process and can lead to poor public decision making. This study shows that such practices are surprisingly common, and that researchers, users of research, and the public need to consider ways to address the situation. We offer suggestions that journals, funding agencies, and policy makers can implement to improve the transparency of the publication process and enhance the replicability of the research that is published.

They canvass an interesting selection of cases from different fields (and I alert readers that I don’t have the faintest interest in debating the pros or cons of the issues in these other studies at this blog and do not wish readers to debate these issues here.) They report quantitative results from McCullough’s replication work in economics, but most readers will probably take the most interest from their accounts of several high profile studies – the Boston Fed study, the Bellesiles affair and the Hockey Stick.

The “Boston Fed” study was apparently related to policy changes on subprime mortgages. (I don't want people to debate the rights or wrongs of the policy, only the replicability issue.) It appears that data requests were refused and, reminiscent of our dealings with Santer, Jones, etc., authors interested in testing the results finally resorted to FOI requests. (Climate scientists resent this, but there are precedents.) MM report the denouement as follows:

Day and Liebowitz (1998) filed a Freedom of Information Act request to obtain identifiers for these observations so they could re-run the analysis without them. They also noted that the Boston Fed authors (Munnell et al., 1992) did not use the applicant’s credit score as generated by the bank, but had replaced it with three alternate indicators they themselves constructed, which Day and Liebowitz found had omitted many standard indicators of creditworthiness. Day and Liebowitz showed that simply reverting to the bank’s own credit score and correcting the 26 misclassified observations caused the discrimination coefficient to drop to zero.

Harrison (1998) noted that the Boston Fed data set included many more variables than the authors had actually used. These included measures such as marital status, age, and whether the application contained information the bank was unable to verify. These variables were significant when added back in, and their inclusion caused the discrimination effects to drop to zero even without correcting the data errors noted by Day and Liebowitz.

Thus, the original Boston Fed conclusions were eventually shown to be wholly insupportable. But due to various delays these studies were not published until 1998 in Economic Inquiry, six years after the original study’s release …

The Bellesiles story is also very interesting for blog readers. Clayton Cramer, Bellesiles’ nemesis was a software engineer – he profiles very much like a typical Climate Audit reader. Cramer eventually published in the journal, Shotgun News, which, according to recent statistics, has an impact factor lower than either Science or Nature.

Despite the political importance of the topic, professional historians did not actively scrutinize Bellesiles’ thesis. Instead it was non-historians who began the process of due diligence. Stephen Halbrook, a lawyer, checked the probate records for Thomas Jefferson’s three estates (Halbrook, 2000). He found no record of any firearm, despite the fact that Jefferson is known to have been a lifelong owner of firearms, putting into question the usefulness of probate records for the purpose. Soon after, a software engineer named Clayton Cramer began checking Bellesiles’ sources. Cramer, who has a master’s degree in history, found dates changed and quotations substantively altered. However, Cramer was unable to get academic journals to publish his findings. Instead he began sending articles to magazines such as the National Review Online and Shotgun News. He compiled an extensive list of errors, numbering in the hundreds, and went so far as to scan original documents and post them on his website so historians would check the original documents against the text of Bellesiles’ book (Cramer, 2006)

Bellesiles claimed to have examined hundreds of San Francisco probate records from the 1850s. When confronted with the fact that all the San Francisco probate records had been destroyed in the 1906 earthquake, Bellesiles claimed that he obtained them from the Contra Costa County Historical Society. But the Society stated that it did not possess the requisite records. Bellesiles soon resorted to ad hominem, claiming that the amateur critics could not be trusted because they lack credentials. Referring to Clayton Cramer, Bellesiles said, “It is not my intention to give an introductory history lesson, but as a non-historian, Mr. Cramer may not appreciate that historians do not just chronicle the past, but attempt to analyze events and ideas while providing contexts for documents” (Bellesiles, 2001). Note that Bellesiles could have, at any time, ended the controversy by simply supplying his data to his critics, something he refused to do.

Ultimately, Bellesiles appears to have been brought down by the black-and-white fact that it was impossible for him to have consulted the records said to have been consulted, because they didn't exist. Anyone remember the claims in Jones et al 1990 to have consulted Chinese station histories that don't exist, and the absurd claims of the coauthors to have lost the records that had supposedly been faithfully preserved through World War II and the Cultural Revolution? But it's climate, and Doug Keenan's effort to pursue the matter got nowhere.

I raised one beef today with coauthor McK. The term "due diligence" is used to frame the discussion – as it usefully puts "(journal) peer review" in a more general context. However, Mc and Mc do not identify the first academic article to use this term in this context (though that article was cited in passing on another matter). The first such usage, to my knowledge, was, of course, McIntyre and McKitrick (2005 EE), which ended as follows:

We are also struck by the extremely limited extent of due diligence involved in peer review as carried out by paleoclimate journals, as compared with the level of due diligence involved in auditing financial statements or carrying out a feasibility study in mineral development. For example, “peer review” in even the most eminent paleoclimate publications, as presently practiced, does not typically involve any examination of data, replication of calculations or ensuring that data and computational procedures are archived. We are not suggesting peer reviewers should be auditors. Referees are not compensated for their efforts and journals would not be able to get unpaid peer reviewers to carry out thorough audits. We ourselves do not have explicit recommendations on resolving this problem, although ensuring the archiving of code and data as used is an obvious and inexpensive way of mitigating the problem.

But it seems self-evident to us that, recognizing the limited due diligence of paleoclimate journal peer review, it would have been prudent for someone to have actually checked MBH98 data and methods against original data before adopting MBH98 results in the main IPCC promotional graphics.

The issues raised in McCullough and McKitrick are important ones and are presented in an engaging fashion (though I'm obviously a fellow traveller on these matters). Ross was on one radio show already and the host, like any member of the public (as I once was), was dumbfounded at the lack of due diligence in the chain.

A simple and virtually zero-cost improvement in the system would be one that we’ve long supported: require the archiving of data and code. The purpose of this requirement has been totally misrepresented by Gavin Schmidt – it’s not to parse for code errors, but to put yourself in a position where you can quickly analyse sensitivities or the impact of new data, without having to run the gauntlet of doing everything from scratch.

As I've said on numerous occasions, I do not think that the issue is primarily inadequate peer review, though, in my opinion, journal peer review all too easily lapses into POV gatekeeping in the style of Burger and Cubasch Referee #2, and academicians are far too quick to shrug that off. Journal peer review is what it is – a cursory form of due diligence. The issue is that "buyers" assume that it's something that it isn't and fail to exercise caveat emptor.


The Three AVHRR PCs

Roman Mureika perceptively observed about 10 days ago that the AWS reconstruction consisted of only 3 PCs. Jeff Id has extended this to observing that the AVHRR reconstruction also consists of only 3 PCs. Jeff's demonstration of this is correct, but a little awkward. I'll show an alternate demonstration which exploits a useful property of the singular value (eigenvalue) decomposition: the SVD of the transpose of a matrix has the same singular values as the SVD of the matrix itself, with the roles of the left and right singular vectors interchanged. If Jeff had simply transposed the short fat matrix to a long thin matrix, his algebra would have been simplified. I've also extended his analysis to plot the 3 eigenvectors (corresponding to the 3 PCs) on a geographic grid. Steig has archived something that's been fitted. There is no possibility that the AVHRR archive represents original data.
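
A toy demonstration of the transpose property (not Steig data, just a random matrix):

set.seed(2)
X=matrix(rnorm(20*7),20,7)          # toy "long thin" matrix
s1=svd(X); s2=svd(t(X))
max(abs(s1$d-s2$d))                  # ~0: same singular values
max(abs(abs(s1$u)-abs(s2$v)))        # ~0: left singular vectors of X = right singular vectors of t(X), up to sign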

First download the AVHRR recon data:

download.file("http://faculty.washington.edu/steig/nature09data/ant_recon.txt","temp.dat",mode="wb")
grid=scan("temp.dat",n= -1) # 3305400
#600*5509 #[1] 3305400
recon=ts(t(array(grid,dim=c(5509,600))),start=c(1957,1),freq=12)
dimnames(recon)[[2]]=1:5509

Now download the AVHRR grid info (changing to -180/180 long for plotting):

grid.info=1:5509;grid.info=data.frame(grid.info);names(grid.info)="id"
grid.info$lat=scan("http://faculty.washington.edu/steig/nature09data/data/Tir_lats.txt")
grid.info$long=scan("http://faculty.washington.edu/steig/nature09data/data/Tir_lons.txt")
temp=grid.info$long>180
grid.info$long[temp]= -(360- grid.info$long[temp])

Now transpose and do the singular value decomposition:

X=t(recon);dim(X); rm(recon) #5509 600
svd0=svd(X)

Next plot the eigenvalues. This proves that only 3 PCs contribute to the recon.

plot(svd0$d,xlim=c(0,100),main="Steig AVHRR Eigenvalues")

This yields the following plot:

For good order’s sake, the first six eigenvalues are:

round(svd0$d[1:6],7)
#[1] 2272.6863885 993.2727255 708.5284449 0.0000024 0.0000024 0.0000024

If you're wondering whether there might be some variation in this data over and above the three-PC fit (notwithstanding the eigenvalues – say, you don't exactly know what an eigenvalue is), here's a further simple proof that the AVHRR archive is fitted. Expand the 3 PCs and 3 eigenvectors and compare to the archive: the difference is negligible. As the others have noted, Steig's AVHRR archive is a fitted product. There is NO possibility that this is original data.

A= svd0$u[,1:3]%*% diag(svd0$d[1:3]) %*% t(svd0$v[,1:3])
range(A-X)
#[1] -6.12266e-08 6.07958e-08

A plot of the 3 PCs (identical to Jeff Id’s plot) can be produced as follows:

plot.ts(svd0$v[,1:3],main="Steig AVHRR PCs")

This yields the following diagram – with the interesting change in amplitude of the PC3 in the early 1980s – about the time when AVHRR measurements actually began to exist.

Jeff did not plot the three corresponding eigenvectors. This is a bit trickier, but we have the grid info on the location of the AVHRR gridcells and, by color coding the eigenvector values, these can be illustrated in the same way as trend maps. The maps below do NOT show trends. They show the weights in each of the PCs. The following code produces the three maps. Using the method here, I’ve plotted each pixel as a round point. You can control the point size through cex; in each map using this method, I usually have to guess at the size and experiment a little to get a map that looks OK. (If nothing else, this shows that it doesn’t take much code in modern languages to produce pretty graphics.)

#library(GDD)
library(mapproj)
col1=c("#0000BB","#0014FF","#006BFF","#00C2FF","lightblue2","lightblue1","grey85",
"lightgoldenrodyellow","lightgoldenrod","#FF8700","#FF5B00","#D70000","#800000")
K=length(col1)
M=nrow(grid.info)
for (eigen in 1:3) {
grid.info$eof=svd0$u[,eigen];
ylim0=range(grid.info$eof,na.rm=T);ylim0 # 0.007698207 0.200473573
y0=ceiling( max(abs(ylim0))/.005) *.005;y0 #.025
breaks0=seq(-y0,y0,2*y0/K); (N=length(breaks0))
h=function(x) { if (is.na(x)) h=NA else {
if(x<=max(breaks0)) h= min( (1:length(breaks0)) [ x<breaks0 ]) -1 else h= length(breaks0);
h[h==0]=1}
h}
grid.info$class=rep(NA,M); for(i in 1:M) grid.info$class[i]=h(grid.info$eof[i])

##PLOT
# GDD(file=paste("d:/climate/images/2009/steig/avhdd.eof",eigen,".gif",sep=""),type="gif",h=420,w=420)
par(mar=c(3,3,2,1))
map('world',proj='azequidistant',orient=c(-90,0,0),ylim=c(-90,-60),fill=TRUE,col="white") #irrelevant
a=seq(-180,150,30); for(i in 1:length(a)) lines(mapproject(list(x=rep(a[i],181),y= c(-90:90) )),col=4,lty=2)
a=c(-180,0); for(i in 1:length(a)) lines(mapproject(list(x=rep(a[i],181),y= c(-90:90) )),col=4,lty=1)
a=seq(-50,-80,-10);
for(i in 1:4) lines(mapproject(list(y= rep(a[i],72),x= seq(-177.5,177.5,5) )),col=4,lty=2 )
points( mapproject(list(y= grid.info$lat,x= grid.info$long) ),pch=19,col=col1[grid.info$class],cex=.5 )
title(paste("Steig AVHRR Eigenvector ",eigen) )
# dev.off()
} # close loop over the three eigenvectors

Here are the three eigenvectors:


It is possible that this archive is derived from RegEM PTTLS, but, if so, it's done in a different way than what we've seen for the AWS data by parsing Tapio Schneider's Matlab code. In that code, real data, where available, is spliced with the fitted data. In the AVHRR case, it doesn't look like any real data is spliced back in – if there were, you wouldn't get such a perfect fit.

There’s another potentially interesting phenomenon here – in two cases, when RegEM is applied to truncated total least squares in which the truncation parameter is set at 3, the fitted output is made of 3 PCs. Is there a connection between the truncation parameter and the number of PCs required to decompose the solution? If so, is the RegEM methodology simply a long-winded way of converging to something that can be calculated directly? Sort of like a slow-converging infinite series where the answer can be calculated directly. Just a thought.
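
One way to probe this question with the R port shown below (a sketch, not a proof; note that where real data is spliced back in, the numerical rank of the output will exceed regpar):

numrank=function(M,tol=1e-6) { d=svd(M)$d; sum(d>tol*d[1]) }   # count of non-negligible singular values
for (rp in 2:4) {
fit=regem_pttls(X=jeff$X,maxiter=50,regpar=rp)
print(c(regpar=rp,rank=numrank(fit[[length(fit)]]$X)))
}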

RegEM PTTLS Ported to R

I've now ported my emulation of Schneider's RegEM PTTLS to R and benchmarked it against Jeff's Matlab run as shown below. I caution readers that this is just an algorithm. There are other ways of doing regressions and infills. The apparent convergence to three PCs noted by Roman remains a highly interesting phenomenon still to be investigated. There are other RegEM flavors in Schneider's code, but I haven't bothered trying to emulate these, as it's the PTTLS version that seems to be in play right now.

The method below retains coefficients from the final iteration, which I plan to use for further analysis. Here’s a short benchmark of the regem_pttls function.

First load functions

source("http://www.climateaudit.info/scripts/regem/regem.functions.txt")

Then load the data (also with a collation of Matlab intermediates)

#this uses BAS scrape for surface and reverse-engineered Steig recon_aws for AWS. Two surface series need to be deducted (Marion, Gough)
#version here is not quite file compatible with Steig. See Nic L on Dec 2002 problem
download.file("http://www.climateaudit.info/data/steig/Data.tab","Data.tab",mode="wb")
load("Data.tab")
download.file("http://www.climateaudit.info/data/steig/Info.tab","Info.tab",mode="wb")
load("Info.tab")
download.file("http://www.climateaudit.info/data/steig/recon_aws.tab","recon_aws.tab",mode="wb")
load("recon_aws.tab")
dimnames(recon_aws)[[2]]=Info$aws_grid$id
download.file("http://www.climateaudit.info/data/steig/jeff.tab","temp.dat",mode="wb");load("temp.dat") #matlab compilation

#surface
surf=Data$surface
surf=window(surf,start=1957,end=c(2006,12)) #ends in 2006 per Steig
surf = surf[,-26] #deletes column 26 – Marion
surf = surf[,-17] #deletes column 17 – Gough
dim(surf) #600 42
sanom=as.matrix(surf)
for (i in 1:42) sanom[,i]=anom(surf[,i],reference=1957:2007)
#Steig probably used different reference period and this can be experimented with later
(ny=ncol(sanom)) #42

#AWS reverse engineered (rather than READER scrape)
anoms =window(recon_aws, start = 1980);dim(anoms) #324 63
dat_aws = window(Data$aws[,colnames(recon_aws)],start=1980,end=c(2006,12));
dim(dat_aws) #324 63
anoms[is.na(dat_aws)] =NA
#apply(anoms,2,mean,na.rm=T) #some are zero, some aren’t
(nx=ncol(anoms)) #63

# combine anomalies
anomalies=ts.union(anoms,sanom)
dimnames(anomalies)[[2]]=c( dimnames(anoms)[[2]],dimnames(sanom)[[2]])
dim(anomalies) #600 105

Now do the iteration through the function below for 5 iterations (it takes a while (: )

regemtest=regem_pttls(X=jeff$X,maxiter=5,regpar=3)

Now do a quick comparison to Matlab results:

test=unlist(sapply(regemtest,function (A) A$dXmis) )
data.frame(test=test,jeff=jeff$dXmis[1:length(test),3])

They match perfectly. Wasn't that easy?

# test jeff
#1 1.25811556 1.25800
#2 0.46735524 0.46740
#3 0.31128378 0.31130
#4 0.18168994 0.18170
#5 0.11207077 0.11210
#6 0.08017931 0.08018

To do a run on (say) surface stations by themselves with regpar=2, just do this

regem.surface=regem_pttls(X=jeff$X[,64:105],maxiter=5,regpar=2)

To get station-by-station RegEM trends (whatever they may prove to be):

trend=function(x) lm(x~ I((1:length(x))/12))$coef[2]
trend.ant= apply(regem.surface[[21]]$X,2,trend)

Porting RegEM to R #1

I've transliterated the relevant Tapio Schneider code into R (pttls.m) and the parts of regem.m that seem relevant at present. Jeff Id has extracted a variety of intermediates from his Matlab run and I've fully reconciled through two steps, with the remaining differences appearing to be due to transmission rounding. My dXmis statistic at step one was 1.258116 (Jeff – 1.258) and at step two was 0.4673552 (Jeff – 0.4674). I've posted a detailed turnkey script for the reconciliation here (something that I do from time to time to make sure that I don't inadvertently modify a reconciliation – scripts which are unexciting but time consuming).

I’ve added in a few features that I’ll use for analysis purposes. RegEM calculates a bazillion coefficients – well, maybe not quite a bazillion – in the case at hand, 1,341,662 coefficients per iteration. Given that there are only 23,449 actual measurements between the AWS and surface stations in the 1957-2006 period, this is a yield of over 57 coefficients per measurement per iteration, which seems like a high yield and one that might give rise to simplification.

In particular, given that the "money" series is the reconstruction trend, this means that the station and AWS reconstructions end up being combined in some way. By doing the algebra in a little different order (merely the associative property of matrix multiplication), we should be able to get to weighting factors using this information. But that's a little distance in the future.
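
A toy illustration of the re-ordering (random matrices standing in for the RegEM objects, which I haven't extracted yet): if the infilled series are Yhat = X %*% B and a continental average applies weights a to the infilled series, then associativity gives the effective weight of each actual station directly as B %*% a.

set.seed(3)
X=matrix(rnorm(600*42),600,42)    # stand-in for 42 surface station series
B=matrix(rnorm(42*63),42,63)      # stand-in coefficient matrix
a=rep(1/63,63)                    # stand-in averaging weights over 63 infilled series
range((X%*%B)%*%a - X%*%(B%*%a))  # ~0: same answer, so B%*%a gives per-station weights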

Today, I want to show a couple of graphics from the first two steps, in which I calculated the proportion of positive coefficients to total coefficients by year. Given that we’re dealing with millions of coefficients, these coefficients have to be considered statistically – something that you see in random matrix theory, which we’ve alluded to from time to time. Here they are. The proportions are not identical (I checked), but there isn’t much change in the first step.


What does this mean? Dunno but, intuitively, there are a couple of things that I don’t like.

First, I don’t like the idea that 1 out of 3 coefficients is negative. When things are eventually recombined into an Antarctic average, these negative coefficients will express themselves as upside down contributions from some stations – which surely can’t be a good thing.

Second, I don’t like the non-stationarity in the proportion of positive coefficients. Why does the proportion of positive coefficients decrease so sharply from the early 1980s to 2000? Does it “matter”? Dunno. This is the sort of thing that IMO makes it very undesirable to use poorly understood methods (such as TTLS RegEM with RegPar=3) for important applied results, particularly when the new results seem to yield “better” answers than earlier results with better understood methods (GISTEMP, Monaghan) and the differences are not clearly reconciled.

Anyway, I should have an end-to-end port of this method to R later today or tomorrow.

Interaction of Infilling on Std Deviation

Standardization in Mannian algorithms is always a bit of an adventure. The bias towards bristlecones and HS-shaped series from the impact of Mann’s short segment standardization on his tree ring PCs has been widely publicized. Smerdon’s demonstration of defects in Rutherford et al 2005, Mann et al 2005 and Mann et al 2007 all relate to errors introduced by inconsistencies between the standardization methods and the assumptions of the algorithm. Jean S has pointed out further defects that have not yet been incorporated in the PRL.

Jeff Id and I have been exchanging digital files. Jeff has provided some intermediates from his Matlab run and I’ve made some progress in emulating this in R. I think that I’ve got a stable emulation of the Truncated TLS operation (Tapio Schneider’s pttls function) and I’ve got a pretty close emulation of the Xmis object in the first step of the RegEM operation and am getting close to the first step X object.

Along the way, I noticed something which may prove of interest – tho I caution that this is just a diarized note for follow up at present.

One of the fundamental assumptions of Tapio Schneider's RegEM is that data is missing at random. This is clearly untrue for Steig's data. However, this sort of failure doesn't necessarily mean the algorithm is worthless – but careful authors would surely have discussed this issue and its potential impact.

One key aspect of the non-randomness is that some series are missing a LOT more data than others. It looks like this could have a material effect on weights through an interesting step in the algorithm.

Schneider’s algorithm first centers the data (Matlab code below):

[X, M] = center(X); % center data to mean zero

Then missing data is set at 0.

X(indmis) = zeros(nmis, 1); % fill missing entries with zeros

A little further on, the series are divided by their standard deviations:

% scale variables to unit variance

Do you see what happens? If a series has a LOT of missing data, the missing data is all set at 0 for the purposes of the standard deviation. Could a potential bias arise? Here’s a plot of the standard deviation after first step infilling. It’s interesting that the infilled standard deviations of inland stations are so much lower than coastal stations, though there’s no reason to believe that this is the case physically. It’s only because the inland stations are missing more data – which is thus set to zero and reduces the standard deviation. This might reduce the weighting of the inland stations relative to the Peninsula stations if it persists through later steps. I don’t see any code right now that would appear to compensate for this (but I haven’t finished yet.) There’s no reason why there would be such code. Schneider assumes that data is missing at random, while Steig and Mann ignore this caveat. I’ll keep you posted as I continue to work through my emulation of Schneider’s RegEM.
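
A toy illustration of the point (my sketch, not Schneider's code): the larger the fraction of a centered series that is set to zero, the smaller the standard deviation computed from it.

set.seed(4)
x=rnorm(600)                                   # a complete centered series, sd about 1
frac=c(0,.25,.5,.75)                           # fraction of values treated as missing
round(sapply(frac,function(f) {y=x; y[sample(600,round(f*600))]=0; sd(y)}),2)
# the sd shrinks roughly like sqrt(1 - fraction missing)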

UPDATE Feb 17:
Here’s the same plot after step 21 which is fairly close to convergence. The variations in std dev still appear unphysical.

UPDATE Feb 19:
Tapio Schneider says above:

The initial variance estimate is biased because missing values are set to the mean (zero after centering), but the estimate is iteratively improved. (One can show that the estimate in the ordinary EM algorithm converges monotonically to the maximum likelihood estimate for normal data; in the regularized EM algorithm, this monotonic convergence is assured for a fixed regularization parameter but not necessarily in general, when regularization parameters are chosen adaptively.)

The following diagram of PCs is from the AVHRR data. Note the structure of the PC3. PCs recover patterns. The pattern being recovered in the PC3 is surely the infilling of data with zeros.

So notwithstanding Tapio’s comment, it sure looks like the infilling with zero pattern is preserved in the final result. As a caveat, while Steig said loudly that he used the unadorned Schneider algorithm, this statement contradicts a statement in the article that they used the algorithm as “adapted” by Mann and Rutherford. Previous adaptations by these authors have been criticized by Smerdon in three publications and it is possible that the phenomenon observed here is related to the adaptations rather than the Schneider algorithm itself. Without knowing what Steig really did – and he’s incommunicado – there’s still a lot of speculation here.