Beckers and Rixen 2003 (url) is an interesting read in two respects:
1) they present a non-RegEM infilling approach. The method appears to be exactly the same as one that I (independently) implemented and illustrated about a month ago – what I termed “truncated PC”. This was actually the very first thing that I did in climate science, as I used this sort of method in 2003 to try to calculate temperature PCs when there was missing data, in that case applying notes from Sam Roweis.
2) their key example is infilling of missing AVHRR data for the Adriatic Sea.
Beckers and Rixen infilling proceeds as follows. They do a trial infill of missing data with monthly means from the available data. They then do a PC decomposition of the trial matrix. They retain k PCs and eigenvectors and expand these to obtain an estimate of the full matrix. In the next iteration, they replace the trial values with these estimates of the missing data. The process converges quite quickly and stops when successive matrices are close enough together. RegEM, by contrast, operates line-by-line with a huge expansion of the number of operations. I find it hard to understand what advantages line-by-line RegEM TTLS has over truncated PC. There’s nothing in Schneider that deals with this specifically.
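The iteration just described can be sketched in a few lines. This is only a minimal illustration in Python with NumPy, not Beckers and Rixen's code and not my own earlier script. The function name truncated_pc_infill and the parameters tol and max_iter are hypothetical; column means stand in for the monthly means; the SVD is taken on the uncentered trial matrix; and the stopping rule (RMS change between successive matrices) is an assumption, since the text above only says the matrices must be "close enough together".

```python
import numpy as np

def truncated_pc_infill(X, k, tol=1e-6, max_iter=500):
    """Iteratively infill NaNs in X using a rank-k truncated SVD (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Trial infill: column means of the available data
    # (an assumed stand-in for the monthly means in the text).
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    for iteration in range(1, max_iter + 1):
        # PC decomposition of the current trial matrix.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)

        # Retain k PCs and eigenvectors and expand them to a full-matrix estimate.
        estimate = (U[:, :k] * s[:k]) @ Vt[:k, :]

        # Observed values are kept; only the missing cells take the new estimate.
        new_filled = np.where(missing, estimate, X)

        # Assumed stopping rule: RMS change between successive matrices.
        change = np.sqrt(np.mean((new_filled - filled) ** 2))
        filled = new_filled
        if change < tol:
            break

    return filled, iteration
```

Each pass costs one SVD of the whole matrix, which is the point of the comparison with a line-by-line method. The choice of k is left to the user here; the sketch does nothing to select it.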
For reference, I noticed that there are now a number of canned packages in R for imputing missing values and a website devoted to this issue, which provides many references, including Schneider 2001, though not the Mannian corpus.
The Beckers and Rixen example of Adriatic data is shown below – and, in this case, there is quite a bit of cloud-obscured data to be infilled. I presume that something similar had to be done in the Antarctic by Comiso.
Comiso appears to have done a series of Antarctic cloud masking exercises, with substantial changes from version to version and with Steig et al 2009 being the most recent version (assuming that Comiso did it). The Antarctic infilling seems to be a bit more complicated than the Adriatic because it seems that cloud temperatures can be higher than surface temperatures, adding a substantial layer of complexity to the problem of deconvolving cloud and surface measurements.