[shakes head in disbelief]

Peter D. Tillman

Consulting Geologist, Arizona and New Mexico (USA)

> if we weren't doing what we are doing, we'd be doing something else, which might be optimal if what we were doing happened to fit the optimality conditions.

… which is nice, since this purely hypothetical advantage is not offered by other current CFR methods (and also we might fool a reviewer or two with these fancy-sounding sentences) 🙂

> tree growth doesn't drive the climate

There could be feedback, but yes, it's unlikely that tree-ring data would represent a cause in Pearl's sense.

Thus was born RegEM. Chances are (just a guess on my part) that the people developing computer algorithms to fill in random holes in data matrices weren't thinking about tree rings and climate when they developed the recursive data algorithm.

One of the pitfalls with the EM algorithm: its results are best used when the data omissions are not meaningful, i.e., caused by random events such as coding errors, as opposed to, say, data collection ceasing upon a subject's death. A second pitfall is failing to realize that it only produces *expected* results by filling in the most likely value. This is useful, say, when training a Bayes net, but I agree it's of questionable value for learning something new. One exception, though, might be learning the values of a hidden variable.
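For what it's worth, that "fill in the most likely value" behavior is easy to see in a toy sketch. This is my own illustration under a missing-at-random, multivariate-normal assumption — not the RegEM code itself:

```python
# Toy EM-style imputation for multivariate normal data with values
# missing at random. NOT RegEM -- just an illustration of the E-step
# filling each hole with its conditional expectation.
import numpy as np

rng = np.random.default_rng(0)

# Two strongly correlated columns with ~10% of entries punched out.
n = 200
x = rng.normal(size=n)
data = np.column_stack([x, 2.0 * x + rng.normal(scale=0.5, size=n)])
mask = rng.random(data.shape) < 0.1          # True where a value is missing
obs = np.where(mask, np.nan, data)

# Start from column means, then alternate: re-estimate mean/covariance
# from the completed matrix (M-step), refill holes with the conditional
# expectation E[x_missing | x_observed] under a Gaussian model (E-step).
filled = np.where(mask, np.nanmean(obs, axis=0), obs)
for _ in range(50):
    mu = filled.mean(axis=0)
    cov = np.cov(filled, rowvar=False)
    for i in range(n):
        m = mask[i]
        if not m.any() or m.all():
            continue
        c_oo = cov[np.ix_(~m, ~m)]           # covariance of observed entries
        c_mo = cov[np.ix_(m, ~m)]            # cross-covariance missing/observed
        filled[i, m] = mu[m] + c_mo @ np.linalg.solve(c_oo, filled[i, ~m] - mu[~m])
```

Because the imputed values are conditional expectations, they sit on the regression line implied by the other column — plausible, but by construction they cannot tell you anything the observed data didn't already imply.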

Speaking of omission, I admit I haven’t read the paper yet. I’ll have to correct that soon. It’s a bit hard to imagine a cause of meaningful omissions in tree ring data.

> regularization process introduces a bias in the estimated missing values

Yeah. That’s bizarre. If anything, the missing value is replaced with an estimate biased by the other data.

> The claim that RegEM's properties are "demonstrably optimal in the limit of no regularization" amounts to saying that ridge regression has the advantage that if you don't do ridge regression it reduces to OLS, and OLS is optimal, in those cases where OLS is the optimal estimator. In other words, if we weren't doing what we are doing, we'd be doing something else, which might be optimal if what we were doing happened to fit the optimality conditions.
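To make the joke concrete, here's a quick numerical check (my own toy sketch, nothing from the paper) that ridge regression with the penalty sent to zero is just OLS:

```python
# Ridge regression with lambda -> 0 reduces to OLS: a toy check.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=50)

def ridge(X, y, lam):
    """Ridge estimate: (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_ridge = ridge(X, y, 1e-10)   # "in the limit of no regularization"
```

With the penalty at 1e-10 the two estimates agree to many decimal places — so the "optimality" on offer is just OLS's own, and only in the cases where OLS was already the optimal estimator.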

> would raise alarm bells in econometrics

My alarm bells went wild after reading this (my bold):

> As explained by Schneider [2001], under normality assumptions, the conventional EM algorithm without regularization converges to the maximum likelihood estimates of the mean values, covariance matrices and missing values, which thus enjoy the optimality properties common to maximum likelihood estimates [Little and Rubin, 1987]. In the limit of no regularization, as Schneider [2001] further explains, the RegEM algorithm reduces to the conventional EM algorithm and thus enjoys the same optimality properties. While the **regularization process introduces a bias in the estimated missing values** as the price for a reduced variance (the bias/variance trade-off common to all regularized regression approaches), it is advisable in the potentially ill-posed problems common to CFR. Unlike other current CFR methods, RegEM offers the theoretical advantage that its properties are demonstrably optimal in the limit of no regularization.
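The bias/variance trade-off they invoke is real enough, and easy to see in a toy Monte Carlo (my own sketch, with made-up near-collinear data standing in for the "ill-posed" CFR setting):

```python
# Bias/variance trade-off of ridge vs. OLS under near-collinearity:
# a toy Monte Carlo, not anything from the paper.
import numpy as np

rng = np.random.default_rng(2)
beta_true = np.array([1.0, 1.0])
lam = 20.0                                   # deliberately heavy penalty
ols_est, ridge_est = [], []
for _ in range(2000):
    x1 = rng.normal(size=40)
    x2 = x1 + rng.normal(scale=0.05, size=40)   # nearly collinear column
    X = np.column_stack([x1, x2])
    y = X @ beta_true + rng.normal(size=40)
    ols_est.append(np.linalg.lstsq(X, y, rcond=None)[0])
    ridge_est.append(np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y))

ols_est, ridge_est = np.array(ols_est), np.array(ridge_est)
# Across replications: OLS is (roughly) unbiased but wildly variable,
# while ridge is visibly biased toward zero but far more stable.
```

Which is exactly the point being made above: the regularized estimates are *biased*, so calling the method "demonstrably optimal" in the unregularized limit says nothing about the estimates actually produced.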