Nic Lewis on Statistical errors in the Forest 2006 climate sensitivity study

Nic Lewis writes as follows (see related posts here, here)

First, my thanks to Steve for providing this platform. Some of you will know of me as a co-author of the O’Donnell, Lewis, McIntyre and Condon paper on an improved temperature reconstruction for Antarctica. Since then I have mainly been investigating studies of equilibrium climate sensitivity (S) and related issues, since climate sensitivity lies at the centre of the AGW/dangerous climate change thesis. (The equilibrium referred to is that of the ocean – it doesn’t include very slow changes in polar ice sheets, etc.) Obviously, the upper tail of the estimated distribution for S is important, not just its central value.

People convinced as to the accuracy of AO-GCM (Atmosphere Ocean General Circulation Model) simulations may believe that these provide acceptable estimates of S, but even the IPCC does not deny the importance of observational evidence. The most popular observationally-constrained method of estimating climate sensitivity involves comparing data whose relation to S is too complex to permit direct estimation, such as temperatures over a spatio-temporal grid, with simulations thereof by a simplified climate model that has adjustable parameters for setting S and other key climate properties. The model is run at many different parameter combinations; the likelihood of each being the true combination is then assessed from the resulting discrepancies between the modelled and observed temperatures. This procedure requires estimates of the natural spatio-temporal covariability of the observations, which are usually derived from AO-GCM control runs, employing an optimal fingerprinting approach. A Bayesian approach is then used to infer an estimated probability density function (PDF) for climate sensitivity. A more detailed description of these methods is given in AR4 WG1 Appendix 9B, here.
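As a concrete (and much simplified) illustration of this grid-search-plus-Bayes procedure, here is a toy one-parameter Python sketch. The forward model, observations and noise level are invented for illustration and bear no relation to the MIT 2D model or the real diagnostics:

```python
import numpy as np

# A toy 1-D version of the grid procedure described above (illustrative
# numbers only -- this is NOT the MIT 2D model or Forest's diagnostics).
S_grid = np.arange(0.5, 10.05, 0.1)        # candidate values of the parameter S
coef = np.array([0.30, 0.25, 0.28])        # hypothetical linear forward model
obs = np.array([1.10, 0.90, 1.05])         # hypothetical observations
sigma = 0.15                               # assumed observational noise level

# Sum-of-squares discrepancy r2 at each grid point -> Gaussian likelihood
r2 = np.array([np.sum(((obs - coef * S) / sigma) ** 2) for S in S_grid])
likelihood = np.exp(-0.5 * (r2 - r2.min()))

prior = np.ones_like(S_grid)               # uniform prior, as in Forest 2006
post = likelihood * prior                  # Bayes' theorem, unnormalized
post /= post.sum() * 0.1                   # normalize over the 0.1-wide bins

S_mode = S_grid[np.argmax(post)]           # posterior mode of S
```

The real studies do this over a 3-D grid of (S, Kv, Faer) combinations, with the likelihood derived from the optimal-fingerprinting statistics rather than a simple Gaussian, but the prior-times-likelihood-then-normalize structure is the same.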

I have concentrated on the Bayesian inference involved in such studies, since they seem to me in many cases to use inappropriate prior distributions that heavily fatten the upper tail of the estimated PDF for S. I may write a future post concerning that issue, but in this post I want to deal with more basic statistical issues arising in what is, probably, the most important of the Bayesian studies whose PDFs for climate sensitivity were featured in AR4. I refer to the 2006 GRL paper by Forest et al.: Estimated PDFs of climate system properties including natural and anthropogenic forcings, henceforth Forest 2006, available here, with the SI here. Forest 2006 is largely an update, using a more complete set of forcings, of a 2002 paper by Forest et al., also featured in AR4, available here, in which a more detailed description of the methods used is given. Forest 2006, along with several other climate sensitivity studies, used simulations by the MIT 2D model of zonal surface and upper-air temperatures and global deep-ocean temperature, the upper-air data being least influential. Effective ocean diffusivity, Kv, and total aerosol forcing, Faer, are estimated simultaneously with S. It is the use of multiple sets of observational data, combined with the joint estimation of three key uncertain climate parameters, that makes Forest 2006 stand out from similar Bayesian studies.

Forest completely stonewalled my requests to provide data and code for over a year (for part of which time, to be fair, he was recovering from an accident). However, I was able to undertake a study based on the same approach as in Forest 2006 but using improved statistical methods, thanks to data very helpfully made available by the lead authors, respectively Charles Curry and Bruno Sanso, of two related studies that Forest co-authored. Although Forest 2006 stated that the Curry et al. 2005 study used the Forest 2006 data (and indeed relied upon that study’s results in relation to the surface dataset), the MIT model surface temperature simulation dataset for Curry et al. 2005 was very different from that used in the other study, Sanso et al. 2008. The Sanso et al. 2008 dataset turned out to correspond to that actually used in Forest 2006. The saga of the two conflicting datasets was the subject of an article of mine posted at Judith Curry’s blog Climate Etc this summer, here , which largely consisted of an open letter to the chief editor of GRL. Whilst I failed to persuade GRL to require Forest to provide any verifiable data or computer code, he had a change of heart – perhaps prompted by the negative publicity at Climate Etc – and a month later archived the complete code used for Forest 2006, along with semi-processed versions of the relevant MIT model, observational and AO-GCM control run data – the raw MIT model run data having been lost. Well done, Dr Forest. The archive (2GB) is available at .
Forest completely stonewalled my requests to provide data and code for over a year (for part of which time, to be fair, he was recovering from an accident). However, I was able to undertake a study based on the same approach as in Forest 2006 but using improved statistical methods, thanks to data very helpfully made available by the lead authors, respectively Charles Curry and Bruno Sanso, of two related studies that Forest co-authored. Although Forest 2006 stated that the Curry et al. 2005 study used the Forest 2006 data (and indeed relied upon that study’s results in relation to the surface dataset), the MIT model surface temperature simulation dataset for Curry et al. 2005 was very different from that used in the other study, Sanso et al. 2008. The Sanso et al. 2008 dataset turned out to correspond to that actually used in Forest 2006. The saga of the two conflicting datasets was the subject of an article of mine posted at Judith Curry’s blog Climate Etc this summer, here, which largely consisted of an open letter to the chief editor of GRL. Whilst I failed to persuade GRL to require Forest to provide any verifiable data or computer code, he had a change of heart – perhaps prompted by the negative publicity at Climate Etc – and a month later archived the complete code used for Forest 2006, along with semi-processed versions of the relevant MIT model, observational and AO-GCM control run data – the raw MIT model run data having been lost. Well done, Dr Forest. The archive (2GB) is available at .

The code, written in IDL, that Forest has made available is both important and revealing. Important, because all or much of it has been used in many studies cited, or expected to be cited, by the IPCC. That includes, most probably, all the studies based on simulations by the MIT 2D model, both before and after AR4. Moreover, it also includes a number of detection and attribution studies, the IPCC’s “gold standard” in terms of inferring climate change and establishing consistency of AO-GCM simulations of greenhouse gas induced warming with observations. Much of the core code was originally written by Myles Allen, whose heavily-cited 1999 Allen and Tett optimal fingerprinting paper, here, provided the statistical theory on which Forest 2006 and its predecessor and successor studies were based. Allen was a co-author of the Forest et al. 2000 (MIT Report preprint version of GRL paper here), Forest et al. 2001 (MIT Report preprint version of Climate Dynamics paper here) and Forest et al. 2002 studies, in which the methods used in Forest 2006 were developed.

The IDL code is revealing because it incorporates some fundamental statistical errors in the derivation of the likelihood functions from the model-simulation – observation discrepancies. The errors are in the module (written by Forest or under his direction, not by Allen, I think) that computes the relevant likelihood function and combines it with a specified initial distribution (prior) using Bayes' theorem. There are three likelihood functions involved, one for each of the three “diagnostics” – surface, upper-air, and deep-ocean – which involve respectively 20, 218 and 1 observation(s). The bayesupdatenew2 module is therefore called three times if an “expert” prior is being used. Where the prior used is uniform, on the first pass bayesupdatenew2 also computes a likelihood for a second set of discrepancies and uses that as a “data prior”, so the module is only called twice.

Each likelihood is based on the sum of the squares of the ‘whitened’ model-simulation – observation discrepancies, r2. Whitening involves transforming the discrepancies, using an estimate of the inverse spatio-temporal natural variability covariance matrix, so that they would, in a perfect world, be independent standardised normal random variables. The likelihoods are computed from the excess, delta.r2, of r2 over its minimum value, minr2 (occurring where the model run parameters provide the best fit to observations), divided by m, the number of free model parameters, here 3. The statistical derivation implies that delta.r2/m has an F_m,v statistical distribution, in this case that delta.r2/3 has an F_3,v distribution, v being the number of degrees of freedom used in estimating the spatio-temporal natural variability covariance matrix. The math is developed in Forest et al. 2000 and 2001 out of that in Allen and Tett 1999, and I will not argue here about its validity.
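A minimal Python sketch of what whitening does, using an invented covariance estimate rather than Forest's control-run-derived matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                     # toy dimension (not the real 20/218)

# Hypothetical natural-variability covariance estimate C (in the real studies
# this comes from AO-GCM control run segments)
A = rng.standard_normal((n, n))
C = A @ A.T + n * np.eye(n)               # symmetric positive definite

# Whitening matrix W = C^(-1/2) via the eigendecomposition of C
w, V = np.linalg.eigh(C)
W = V @ np.diag(w ** -0.5) @ V.T

# Discrepancies with covariance C have identity covariance after whitening
d = rng.multivariate_normal(np.zeros(n), C, size=20000)
dw = d @ W.T                              # 'whitened' discrepancies
r2 = np.sum(dw ** 2, axis=1)              # squared whitened misfit, as with r2
```

In a perfect world the whitened components are independent N(0,1) variables, so their sample covariance is close to the identity and the mean of r2 is close to n.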

The statistical errors I want to highlight are as follows:

(a) the likelihoods are computed using (1 minus the) cumulative distribution function (CDF) for an F_3,v distribution evaluated at delta.r2/3, rather than its probability density function. A likelihood function is the density of the data, viewed as a function of the parameters. Therefore, it must be based on the PDF, not the CDF, of the F distribution. The following code segment in bayesupdatenew2 incorporates this error:

r2 = r2 - minr2 +1e-6
nf = 3
pp= 1.- f_pdf(r2/float(nf),nf,dof1)

where dof1 is the degrees of freedom used for the diagnostic concerned. Despite the name “f_pdf”, this function gives the F-distribution CDF.
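For readers without IDL, the same two quantities can be compared in Python with scipy, where the naming is unambiguous: f.sf gives 1 − CDF, i.e. what the IDL line effectively computes, while f.pdf gives the density a likelihood requires. The delta.r2 values are illustrative; v = 12 is the code's default dof1:

```python
import numpy as np
from scipy.stats import f

nf, dof = 3, 12                        # 3 free parameters; v = 12 (code default)
delta_r2 = np.array([1e-6, 1.0, 5.0, 15.0])   # illustrative excess misfits

# What the IDL segment computes: 1 - CDF of F_{3,v} at delta.r2/3
# (IDL's f_pdf, despite its name, returns the CDF)
pp_cdf_based = f.sf(delta_r2 / nf, nf, dof)

# What a likelihood requires: the density (PDF) of F_{3,v} at delta.r2/3
pp_pdf_based = f.pdf(delta_r2 / nf, nf, dof)
```

Note that at the best-fit point (delta.r2 → 0) the CDF-based quantity tends to 1, while the uncorrected F_3,v density tends to 0 – the behaviour discussed under (c) below.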

(b) the same calculation, using the F_3,v(delta.r2/3) function, is used not only for the surface and upper-air diagnostics, but also for the univariate deep-ocean diagnostic. The use of the F_3,v distribution, with its argument being delta.r2/3, is based on delta.r2 being, when the model parameter settings are the true ones, the sum of the squares of 3 independent N(0,1) distributed random variables. That there are only 3 such variables, when the diagnostic involves a larger number of observations and hence whitened discrepancies, reflects the higher-dimensional set of whitened discrepancies all having to lie on a 3D hypersurface (assumed flat) spanned by the parameters. However, when there is only one data variable, as with the deep-ocean diagnostic (being its temperature trend), and hence one whitened discrepancy, delta.r2 is the square of one N(0,1) variable, not the sum of the squares of three N(0,1) variables. Therefore, the deep-ocean delta.r2, divided by 1 not 3, will have an F_1,v distribution – implying that the unsquared whitened deep-ocean discrepancy r will have a Student’s t_v distribution. Here, delta.r2 = r2, since a perfect fit to a single data point can be obtained by varying the parameters, implying minr2 = 0.
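The equivalence between r having a t_v distribution and r² having an F_1,v distribution – and the difference from the formula the code actually applies – can be checked numerically. A scipy sketch, using the code's default v = 12 and an arbitrary illustrative discrepancy:

```python
from scipy.stats import f, t

v = 12                                 # degrees of freedom (the code's default)
r = 1.7                                # one whitened deep-ocean discrepancy
delta_r2 = r ** 2                      # minr2 = 0 for a single data point

# r ~ t_v is equivalent to r^2 ~ F_{1,v}: the tail probabilities agree exactly
p_F1 = f.sf(delta_r2, 1, v)            # P(F_{1,v} > r^2)
p_t2 = 2 * t.sf(abs(r), v)             # two-sided P(|t_v| > |r|)

# What Forest 2006 evaluates instead: the 3-parameter formula on one variable
p_forest = f.sf(delta_r2 / 3, 3, v)
```

The first two probabilities are identical; the third, computed from the wrong distribution, is substantially different.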

(c) there is a statistical shortcoming in the Forest et al. 2006 method in relation to the use of an F-distribution based PDF as a likelihood function. A geometrical correction to the F-distribution density is needed in order to convert it from a PDF for the sum of the squares of three N(0,1) distributed variables to a joint PDF for those three variables. The appropriate correction, which follows from the form of the Chi-square distribution PDF, is division of the F-distribution PDF by sqrt(delta.r2). Without that correction, the likelihood function goes to zero, rather than to a maximum, at the best-fit point.
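The origin of the sqrt(delta.r2) correction is easiest to see in the chi-square limit (v → ∞): the chi-square_3 density divided by sqrt(delta.r2) is exactly proportional to exp(−delta.r2/2), the joint density of three N(0,1) variables. A scipy sketch, with v = 12 chosen arbitrarily for the F-based curves:

```python
import numpy as np
from scipy.stats import f, chi2

v = 12                                  # illustrative degrees of freedom
s = np.linspace(1e-6, 10.0, 200)        # delta.r2 values

raw = f.pdf(s / 3, 3, v)                # F-based likelihood: -> 0 at best fit
corrected = raw / np.sqrt(s)            # divided by sqrt(delta.r2), as above

# Chi-square check (the v -> infinity limit): chi2_3 pdf / sqrt(s) is exactly
# proportional to exp(-s/2), the joint density of three N(0,1) variables
ratio = chi2.pdf(s, 3) / np.sqrt(s) / np.exp(-s / 2)
```

The corrected likelihood is maximal at the best-fit point and declines monotonically away from it, as a likelihood in the whitened variables should.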

There may be many other errors in the IDL code archive – I’ve identified a couple. Any readers who are familiar with IDL and have the time might find it interesting to study it, with a view to posting any significant findings – or even to rewriting key parts of it in R.

As it happens the F_3,v PDF is not that different from {1 – CDF} once the parameter combination is well away from the best fit point, so the effect of error (a) is not very substantial. Nevertheless, that Forest made this error – and it was not a mis-coding – seems very surprising.

The effect of (c), which is partially compensated by error (a), is likewise not very substantial.

However, the difference in the behaviour of the likelihood function resulting from error (b) is substantial; the ratio of the Forest 2006 likelihood to the correct likelihood varies by a factor approaching 3 as the parameters are moved away from the best-fit point. And the deep-ocean likelihood is what largely causes the estimated PDF for S ultimately to decline with increasing S: the two other diagnostics provide almost no constraint on very high values of S.
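The size of this effect can be sketched numerically, comparing the shape of the CDF-based deep-ocean likelihood the code computes against a t_v-density likelihood per (b) and (c). Here v = 12 is assumed, and the exact factor depends on v and on how far the discrepancy moves, but it comes out of order 3:

```python
import numpy as np
from scipy.stats import f, t

v = 12                                  # assumed degrees of freedom
r = np.linspace(0.1, 3.0, 30)           # whitened deep-ocean discrepancy
r2 = r ** 2                             # delta.r2 = r2 since minr2 = 0

forest = f.sf(r2 / 3, 3, v)             # likelihood the code actually computes
correct = t.pdf(r, v)                   # t_v density of the single discrepancy

# Normalize both to 1 at the smallest discrepancy and compare shapes
rel = (forest / forest[0]) / (correct / correct[0])
```

With these illustrative settings, the relative weight the coded likelihood gives to a discrepancy of 3 whitened standard deviations, versus the corrected likelihood, is roughly a factor of 3.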

In addition to the Forest 2002 and 2006 papers, I believe these errors also affected the Forest et al. 2008 Tellus A and the Libardoni and Forest 2011 GRL papers, and probably also 2009 and 2010 papers lead-authored by Forest’s regular co-author Sokolov. Multiple citations of results from these various studies are to be expected in the AR5 WG1 report. I put it to Myles Allen – who seems, along with Gabi Hegerl, to be the lead author of Chapter 10 primarily responsible for the sections relating to climate sensitivity – that in view of these serious statistical errors, results from the affected papers should not be cited in the IPCC report. However, whilst accepting that the errors were real, he expressed the view that the existence of these statistical errors didn’t really matter to the results of the papers concerned. His reasoning was that only error (b) had a potentially substantial effect, and that didn’t much matter since there was anyway considerable uncertainty in the ocean data that the studies used. I’m not sure that I agree with this approach.

I would be surprised if the basic statistical errors in the IDL code do not significantly affect the results of some or all of the papers involved. I would like to test this in regard to the Forest 2006 paper by running the IDL code with the errors corrected, in time to put on record, in my “expert reviewer” comments on Chapter 10 of the Second Order Draft of the IPCC AR5 WG1 report, the differences, if significant, that correcting these errors makes to Forest 2006’s results. At least Myles Allen and Gabi Hegerl will then be aware of the size of the differences when deciding whether to ignore them.

Unfortunately, IDL seems to be a very expensive product – the company selling it won’t even give any pricing information without many personal details being provided! There is an open source clone, GDL, which I have tried using, but it lacks too much functionality to run most of the code. I’ve implemented part of the IDL code in R, but it would take far too long to convert it all, and I couldn’t be confident that the results would be correct.

So, my question is, could any reader assist me by running the relevant IDL code and letting me have the results? If so, please can you email me via Steve M. The code in the GRL2006_reproduce archive should be fully turnkey, although it is written for an old version of IDL. I can supply an alternative, corrected version of the module and relevant information/instructions.

In case any of you are wondering, I submitted a paper to Journal of Climate in July detailing my reanalysis of the Forest 2006 datasets, focussing on improving the methodology, in particular by using an objective Bayesian approach. That was just before Forest released the IDL code, so I was unaware then that he had made serious statistical errors in implementing his method, not that they directly affect my paper.

Nicholas Lewis

For the convenience of readers who would like to look at the code without downloading a 2GB archive file, it is as follows:

pro bayesupdatenew2,prior,data,post,expert=expert,hik=hik,mtit=mtit,usegcms=usegcms,alf=alf,dt=dt,indiv=indiv,yrng=yrng,$

if (n_params() eq 0) then begin
  print, 'Usage: bayesupdatenew2,prior,newp,post,expert=expert,$'
  print, '   hik=hik,mtit=mtit,usegcms=usegcms,alf=alf,dt=dt,$'
  print, '   indiv=indiv,yrng=yrng,label=label,dataprior=dataprior,$'
  print, '   dof1=dof1,dof2=dof2,noplot=noplot,igsm2=igsm2,mcmc=mcmc,i2post=i2post'
  print, ' prior = a priori estimate of pdf'
  print, ' data = r2 data used to estimate new pdf'
  print, ' post = a posteriori estimate of pdf'
  print, ' dataprior = 1 if using r2 values for first prior'
  print, ' dof1, dof2 = degreees of freedom for likelihood estimators'
  print, ' dof2 = used for dataprior'
  print, 'i2post = 1, use igsm2 posterior for aerosol prior'

if (not keyword_set(yrng)) then yr = [0,10.] else yr = yrng
if (not keyword_set(dof1)) then dof1 = 12
if (not keyword_set(dof2)) then dof2 = 12 

;set colors

; prior - taken from somewhere old posterior or expert judgement
; data - provided by new diagnostic
; post - returned by updating procedure

kk= findgen(81)*.1 & ss = findgen(101)*.1 +.5 & aa = -findgen(41)*.05+0.5

np = n_elements(data)
;print, 'np = ',np

; create probability from r2 values
r2 = data
pp = fltarr(np)

r2neg = where(r2 le 0., nu)
print, 'Number of negative points=', nu

if (nu ne 0) then r2(r2neg) = 0.

if (keyword_set(igsm2)) then begin
  minr2 = min(r2(1:50,0:75,0:40)) 
  print, 'Minrp2 in igsm2 domain:',minr2
endif else minr2 = min(r2)

print, 'minr2 =',minr2
print, 'Range r2:',range(r2)

r2 = r2 - minr2 +1e-6
nf = 3
;dof = 12
;for i = 0,np-1 do pp(i)=  1.- f_pdf(r2(i)/float(nf),nf,dof)
print, 'Computing p(r2) for HUGE data vector'
pp=  1.- f_pdf(r2/float(nf),nf,dof1)
help, dof1


;interpolate prior to r2 points
; note: no prior for aa = alpha
if (keyword_set(expert)) then begin

; returns 3d joint prior on r2interp grid
;  priorx = krange & priory = srange & priorz = jprior = jointprior
endif ;else begin
;  priorx = kk & priory = ss & priorz = prior 

if (not keyword_set(expert) and  keyword_set(dataprior)) then begin
    r2p = prior 
    r2pneg = where(r2p le 0., nup)
    print, 'Number of negative points=', nup
    if (nup ne 0) then r2p(r2pneg) = 0.

    print, 'Range(r2p)', range(r2p)
    if (keyword_set(igsm2)) then begin
      print, 'Keyword set: igsm2'
      minr2p = min(r2p(1:50,0:75,0:40)) 
      print, 'minr2p ',minr2p
    endif else minr2p = min(r2p)
    r2p = r2p - minr2p + 1e-6
    print, 'Computing p(r2) for HUGE prior vector'

    prior2 =  1.- f_pdf(r2p/float(nf),nf,dof2)
    print,'Range prior2 before ', range(prior2)

    help, dof2
    if (keyword_set(igsm2)) then begin
      prior2(0,*,*) = 0.        ;KV = 0.
      prior2(51:80,*,*) = 0.    ;KV > 25.
;      prior2(*,76:100,*) = 0.   ;CS > 8.
      prior2(*,76:100,*) = 0.   ;CS > 8.
;      prior2(*,*,0:4) = 0.      ;FA > 0.25 
    endif else begin
      prior2(0,*,*) = 0.
      prior2(*,96:100,*) = 0.
endif else prior2 = prior


; multiply probabilities
post = prior2 * pp

; interpolate to finer grid to compute integral
; separate into aa levels
nk = findgen(81)*.1
nk = nk*nk
ns = findgen(101)*.1 + 0.5

; estimate integral to normalize

ds = 0.1 & dk = 0.1 & da = 0.05
;totpl = fltarr(6) ; totpl = total prob at level aa(i)
;for i =0,5 do totpl(i) = total( smpostall(i,*,*) )/(8.*9.5*2.0)   

;where intervals are k=[0,8], s=[0.5,10.], a=[0.5,-1.5]
totp = total(post)/(8.*9.5*2.)


;normalize here
post = post/totp
;print, post

smpostall = post

if (not keyword_set(noplot)) then begin 

;estimate levels for contour from volume integral
clevs = c_int_pdf3d([0.99,.9,.8],smpostall)
;clevs = c_int_pdf3d([0.90],smpostall)
rr= range(smpostall)
print, 'range post:', rr(1) -rr(0)
;clevs = findgen(3)*(rr(1) - rr(0))/4+min(smpostall)
print,' clevs =', clevs
if (not keyword_set(indiv)) then !p.multi=[0,2,4] else !p.multi=0
pmax = max(post)
;print,'max(post), imax', max(post), where(post eq max(post))
ccols = [indgen(3)*50 + 99,255]

pane = ['(a) : ','(b) : ','(c) : ','(d) : ','(e) : ','(f) : ','(g) : ']
titl= ['F!daer!n = 0.5 W/m!u2!n','F!daer!n = 0.0 W/m!u2!n','F!daer!n = -0.25 W/m!u2!n',$
       'F!daer!n = -0.5 W/m!u2!n','F!daer!n = -0.75 W/m!u2!n',$
       'F!daer!n = -1.0 W/m!u2!n','F!daer!n = -1.5 W/m!u2!n']

alevs = [0,10,15,20,25,30,40]
for i = 0,6 do begin
  ii =  alevs(i)
  ka = nk & sa = ns

  contour, post(*,*,ii), sqrt(ka), sa, $
  levels=clevs,c_labels=0,/cell_fill, c_colors=ccols, $
  title=pane(i)+mtit+' : '+titl(i),$
  xtitle='Effective ocean diffusivity [cm!u2!n/s]',$
  ytitle='Climate Sensitivity [!uo!nC]', $
  xticks = 8, xtickv=findgen(9),$
  xtickname=strmid((findgen(9))^2,6,4); , charsize=chsz

  contour, post(*,*,ii), sqrt(ka), sa,/overplot, $
  imax = where(post(*,*,ii) eq pmax, icount)
  ix = where(post(*,*,ii) eq max(post(*,*,ii)))
;  print, imax, ix
  if (icount ne 0) then oplot,[sqrt(ka(imax))],[sa(imax)],psym=sym(1) else $
  oplot, [sqrt(ka(ix))],[sa(ix)], psym = sym(6)
;  for j=0,ni-1 do  oplot, [sqrt(ka(j))],[sa(j)], psym = sym(11)
  if (keyword_set(usegcms)) then begin
    if (keyword_set(label)) then $

  if (keyword_set(dt)) then begin
      dtlabs = replicate(1,31)
      dtlevs =  findgen(31)/20.
      dr =  range(dtdata.dt)
      ddr = dr(1) - dr(0)
      if (ddr lt 0.5) then dtlevs = findgen(31)/50.
      contour,dtdata.dt,sqrt(dtdata.k),dtdata.s,/overplot, levels=dtlevs,$ 
        c_labels=dtlabs, thick=5
      contour,dtdata.dt,sqrt(dtdata.k),dtdata.s,/overplot, levels=[obs],$ 
        c_labels=dtlabs, thick=5, c_linestyle=5

if (keyword_set(alf)) then begin
  na = -findgen(41)*.05+0.5
  set_plot, 'ps'
  device, /encapsulated, /helvetica, font_size=18
  device,/color,bits_per_pixel=8,xsize=20, ysize=5./7.*20
  i = where( na eq alf, nl)
  i = i(0)
  if (nl lt 1) then print, 'No matching aerosol forcing' else begin
    ii = where( na eq alf,ni)
    ka = nk & sa = ns
    contour, post(*,*,ii), sqrt(ka), sa,$
    xtitle='Rate of Ocean Heat Uptake [Sqrt(K!dv!n)]',ytitle='Climate Sensitivity (K)',title=mtit+' : '+titl(i),$
    levels=clevs,c_labels=0,/cell_fill, c_colors=ccols
    contour, post(ii), sqrt(ka), sa,/irregular,/overplot, $
    if (keyword_set(usegcms)) then begin
;    xyouts,sqrt(gcms(0,*))+.15,gcms(1,*),nms,charsize=chsz

    if (keyword_set(dt)) then begin
      dtlabs = replicate(1,31)
      dtlevs =  findgen(31)/20.
      dr =  range(dtdata.dt)
      ddr = dr(1) - dr(0)
      if (ddr lt 0.5) then dtlevs = findgen(31)/50.
      contour,dtdata.dt,sqrt(dtdata.k),dtdata.s,/overplot, levels=dtlevs,$ 
        c_labels=dtlabs, thick=5
      contour,dtdata.dt,sqrt(dtdata.k),dtdata.s,/overplot, levels=[obs],$ 
        c_labels=dtlabs, thick=5, c_linestyle=5


  device, /close,color=0,encapsulated=0
  set_plot, 'x'
  !p.font = -1




  1. Anthony Watts
    Posted Nov 8, 2012 at 8:06 AM | Permalink

    Thanks Nic. Some links are broken in paragraph 3 where you say “here” but there are no links to the source.

    • NicL
      Posted Nov 8, 2012 at 9:04 AM | Permalink

      Thanks, Anthony. Hopefully all the missing links will be fixed shortly.

  2. redcords
    Posted Nov 8, 2012 at 8:58 AM | Permalink

    I’ve never been impressed with most of the code I’ve seen from climate science. It always looks like they hack it until it runs. eg all these commented out lines from where they had to check what it was doing or get it past a problem, all these unexplained constants, almost nothing commented/documented, alternative results from other runs left in and commented out, the 5 “stops” commented out.

    It looks like they never create anything resembling a formal specification beforehand of what the code has to do, they just hack away at it until it runs and they think it’s giving the right answer. So it’s not a surprise to me that in this case a closer look from Nic could reveal problems.

    They read like a cross between the output of a console session and an actual program. Steve’s posted R code sometimes looks like this too but I can understand that when it’s written to support a blog post.

    My 2 favourite lines from the code:
    ; prior2(*,76:100,*) = 0. ;CS > 8.
    prior2(*,76:100,*) = 0. ;CS > 8.

    • redcords
      Posted Nov 8, 2012 at 9:11 AM | Permalink

      To be even more cynical (if that’s possible) let’s look at the filename. Whenever I write code I always name my first attempt XYZnew2.


      • Steven Mosher
        Posted Nov 8, 2012 at 1:41 PM | Permalink

        haha. I have a bunch of routines named ‘final3’ ‘final5’ ‘finalfinal2’

        never name something new or final. wish I could practice what I preach.

        • Posted Nov 8, 2012 at 2:01 PM | Permalink

          When I was using Java more than I do today and trying to refactor some dense jungles of code I suggested that some classes should be renamed things like ObstructiveBureaucrat, to make clear where the bad smells really lay.

        • Jeff Condon
          Posted Nov 8, 2012 at 10:27 PM | Permalink

          I also, never seem to learn!!!

        • redcords
          Posted Nov 9, 2012 at 5:25 AM | Permalink

          Richard Drake’s Java mention is useful.

          Sun’s slogan for Java was “Write once, run anywhere”, Climate Science often appears to be “Run once, publish anywhere”.

  3. diogenes
    Posted Nov 8, 2012 at 9:52 AM | Permalink

    “the existence of these statistical errors didn’t really matter to the results of the papers”

    It is almost axiomatic in climate science, isn’t it, that errors never matter?

  4. Posted Nov 8, 2012 at 10:13 AM | Permalink

    Just one question. Isn’t it rather chapter 12 which deals with climate sensitivity. Then Matthew Collins and Reto Knutti are the coordinating lead authors.

    • zewig
      Posted Nov 8, 2012 at 10:39 AM | Permalink

      I think discussion of climate sensitivity is split between chapters 10 and 12, with rather more detail in chapter 10, particularly on instrumental, observationally-constrained studies. I sent copies of my submitted Journal of Climate paper to Reto Knutti and to Matthew Collins as well as to Peter Stott, but neither of them acknowledged receipt.

  5. janekerr
    Posted Nov 8, 2012 at 10:32 AM | Permalink

    Why don’t we organise a whip-round to buy Nic a copy of IDL?

    • Heretic
      Posted Nov 8, 2012 at 10:45 AM | Permalink

      Good idea!

    • NicL
      Posted Nov 8, 2012 at 11:41 AM | Permalink

      Thanks for the suggestion, but I don’t think it would be right to expect others to contribute to something I can well afford myself. However, the issue is also about experience programming in IDL, which I lack.

  6. Heretic
    Posted Nov 8, 2012 at 10:43 AM | Permalink

    FWIW: effective January 1, 1999 pricing for IDL on the PC platform (Windows, Macintosh, LINUX, X86) platform will increase from $1500 to $1895 per license, as well, maintenance per year will increase to $250.

  7. Les Johnson
    Posted Nov 8, 2012 at 10:53 AM | Permalink

    As for the costs of the IDL code? Just skim some off the top of the Fossil Fuel Checks.


  8. Heretic
    Posted Nov 8, 2012 at 11:10 AM | Permalink

    Cheaper alternative as long as one is (or can find!) a student:

    From Google Groups:

    I looked at the Exelis website yesterday and found that the student edition of IDL is $89 ($149 with the IMSL library). That didn’t seem too bad until I realized that that price is for a 1-year license!
    Kenneth P. Bowman

    Hi all,
    The student version is identical to the full version, just time-limited to a year. So you can create all of the publication-quality graphics that you want. As long as you are a student.

    • zewig
      Posted Nov 8, 2012 at 11:30 AM | Permalink

      The student price only applies to students at North American academic institutions, I believe.

    • Ivan
      Posted Nov 8, 2012 at 11:46 AM | Permalink

      I am a student at a North American University (not in climate or related sciences) and am willing to help.

      • Heretic
        Posted Nov 8, 2012 at 4:41 PM | Permalink

        I’m more than willing to fund at least 50% of this, if it helps.
        (the IP address may be dodgy [obscured/hidden], but the eMail address works).

  9. ConfusedPhoton
    Posted Nov 8, 2012 at 11:13 AM | Permalink

    Have you tried converting IDL to Matlab e.g. IDL2Matlab?
    I know converting the code may introduce some further problem but at least it would be in a form which could be run under Matlab or Octave

    • NicL
      Posted Nov 8, 2012 at 11:32 AM | Permalink

      Thanks – I hadn’t come across IDL2Matlab. I assume that runs under Matlab? I might try it, although my frustrating experiences trying to run the IDL code under the supposed clone GDL don’t bode well.

  10. David S
    Posted Nov 8, 2012 at 11:15 AM | Permalink

    Extraordinary that they will not release their prices. Perhaps they charge everyone a different price depending what they think the market will stand. Would certainly contribute in $ or £ to a whip round for Nic to get a copy if he would like.

    • Duster
      Posted Nov 8, 2012 at 5:44 PM | Permalink

      Re: David S (Nov 8 11:15),

      Many companies that offer “high end” software and hardware seem to be very careful about exposing things like cost structures. They want to “help” tailor “your system” to “your needs.” Complicating this irritating situation are government specifications that pretty much demand that a provider spring for the costs of special software and hardware up front. GPS and GIS systems are particularly aggravating.

  11. Carrick
    Posted Nov 8, 2012 at 11:36 AM | Permalink

    I’d try the demo version of the program before I went any further.

    If there are incompatibilities with the current IDL platform and this software, that would tell you immediately whether you need to go further.

    Beyond that, I’d extend the offer of coauthorship to Forest himself, and explain your needs. He may tell you “no”, but if he’s like me, he’d suck up his pride and try and get it right, and welcome the opportunity to improve the statistical model.

    I’m glad that Allen accepts that errors were made. Like Nic I’m not sure I like the idea of including papers with known, not fully characterized errors in them. That’s just sloppy.

    • NicL
      Posted Nov 8, 2012 at 11:54 AM | Permalink

      I wasn’t aware that there was a demo version of IDL – I can’t see any mention of one on the exelisvis website?

      The most important of the improvements that I have made to the statistical model isn’t fixing (a) to (c) above – I wasn’t aware that Forest 2006 had made errors (a) and (b). Rather, it involves use of a computed joint noninformative prior (Jeffreys’ prior) for the Bayesian statistical inference rather than a joint uniform prior or a mixed uniform / ‘expert’ prior. Until the climate science community routinely uses noninformative priors for Bayesian studies of climate system parameters, the results of many of those studies are likely to be significantly biased.

      • Carrick
        Posted Nov 8, 2012 at 3:18 PM | Permalink

        I found a link here to the demo version for the Mac. I’ve registered for the site, but am not a “verified user” yet.

        I was suggesting Forest based on this comment from your post:

        The IDL code is revealing because it incorporates some fundamental statistical errors in the derivation of the likelihood functions from the model-simulation – observation discrepancies. The errors are in the module (written by Forest or under his direction, not by Allen, I think) that computes the relevant likelihood function and combines it with a specified initial distribution (prior) using Bayes theorem.

        If he wrote the original code, and it contains “fundamental statistical errors”, he might be interested in correcting the code and seeing what effect this has on the outcome.

        In any case he obviously has IDL and is familiar with the code in question, so he could apply corrections and run an updated version. That would certainly add weight to any paper written specifically to address these statistical arguments. Just a thought…

        • NicL
          Posted Nov 9, 2012 at 1:05 PM | Permalink

          I see the demo version for the Mac says:

          “The Trial version of IDL will run in a restricted trial mode for seven-minute sessions”

          Remarkably generous!!

        • Carrick
          Posted Nov 11, 2012 at 4:07 PM | Permalink

          Yep 7 minutes. I never heard back from them either, so I still can’t download it.

          I think I’ll stick to Matlab.

    • NicL
      Posted Nov 8, 2012 at 12:56 PM | Permalink

      The code should be virtually turnkey, and most of it originally ran on the Oxford Physics system, so that sounds a very interesting suggestion, thank you. Am I right in thinking that you are the quantum computation expert? Assuming so, I will, if I may, email you per the contact details on your webpage to discuss this possibility further.

  12. Heather Brown (aka Dartmoor Resident)
    Posted Nov 8, 2012 at 11:41 AM | Permalink

    Count me in as another who will contribute to the cost of buying a licence if this would help

    • Posted Nov 8, 2012 at 12:03 PM | Permalink

      It’s not the cost of the license but the cost in time of becoming fluent in IDL, from what Nic says. These people should use open source languages like R. But that’s another story. This is brilliant work just from inspection of the code. That’s why the code itself must always be made available.

      • seanbrady
        Posted Nov 9, 2012 at 4:03 PM | Permalink

        How about doing a whip-round to fund hiring a computer science student to take on the project? There are probably a lot of IT-expert, statistics-literate graduate students in India. Maybe post a notice on a jobs bulletin board at IIT. You could probably get a real professional job done for a very reasonable price, and you’d be helping a student fund his or her education. I’d be happy to contribute.

  13. Posted Nov 8, 2012 at 12:00 PM | Permalink

    It looks like I can run IDL under the Oxford Physics site license. But I have no experience in running IDL so I probably can’t be much help unless the code is close to turnkey.

    • NicL
      Posted Nov 8, 2012 at 1:03 PM | Permalink

      Sorry, I posted my response to Jonathan Jones under Carrick’s comment, two above, by mistake.

      • Posted Nov 8, 2012 at 5:22 PM | Permalink

        Got your email, thanks. Reply to follow by email soon.

      • Hector Pascal
        Posted Nov 8, 2012 at 8:16 PM | Permalink

        I have IDL V5.3 (approx 2003), but haven’t run it for nearly 10 years. That probably doesn’t help.

        ENVI (Environment for Visualising Images) runs on IDL. It is industry standard software for remote sensing work. If you have a remote sensing group, that’s a good place to start looking for IDL expertise.

        • NicL
          Posted Nov 9, 2012 at 8:29 AM | Permalink

          Many thanks. Actually, IDL V5.3 would quite possibly be fine – most of the code was written in 1999 (some of it for use by the UK Met Office Hadley Centre, it appears). I’ll come back to you if I get stuck, if I may.

  14. Bob K.
    Posted Nov 8, 2012 at 2:07 PM | Permalink


    I’ll just mention one tiny thing, not having read the papers. In a likelihood function, 1 – F(x) = P(X > x) is used when observations are right-censored.
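    Bob K.’s point can be illustrated with a toy likelihood. Below is a Python sketch with an invented exponential model and made-up data (nothing here is from the Forest code): the survival function 1 − F legitimately enters the likelihood only for the right-censored observations.

```python
import numpy as np
from scipy import stats

# Invented exponential model and made-up data, purely to illustrate where
# 1 - F(x) legitimately enters a likelihood.
dist = stats.expon(scale=2.0)

obs = np.array([0.5, 1.7, 3.2])   # fully observed values
cens = np.array([4.0, 6.0])       # right-censored: true values exceed these

# Uncensored observations contribute the density f(x); right-censored ones
# contribute the survival function P(X > c) = 1 - F(c).
loglik = dist.logpdf(obs).sum() + dist.logsf(cens).sum()

# Using 1 - F(x) for the *uncensored* data as well gives a different,
# incorrect likelihood:
wrong = dist.logsf(obs).sum() + dist.logsf(cens).sum()
print(loglik, wrong)
```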

    • Jonathan Bagley
      Posted Nov 9, 2012 at 12:25 PM | Permalink

      That’s what first struck me. The cdf can appear when the data is censored (I haven’t read the papers either).

  15. Brian H
    Posted Nov 8, 2012 at 3:35 PM | Permalink

    The fundamental objectionable switcheroo in such studies of sensitivity etc. is that the lead-in disclaimer is that these simulations are merely “projections”, using various what-if scenarios, and hence cannot be falsified or tested against observations. Then, having established this passepartout, the projections are mooshed, averaged, selectively grouped, and treated as predictions — on which great and weighty decisions can be based. But (see above) still not falsifiable, still assessed only on the basis of hand-massaged hindcasting.

    It really is a shell game.

    • Steven Mosher
      Posted Nov 8, 2012 at 6:36 PM | Permalink


      ur coming close to blasphemy buckeroo

      • Gras Albert
        Posted Nov 9, 2012 at 6:29 AM | Permalink

        Is it lucky that Mooshe hunting season has just closed?

    • Skiphil
      Posted Nov 8, 2012 at 7:58 PM | Permalink

      Re: Brian H (Nov 8 15:35),

      Moshed is good but mooshed is bad…. mashed and mushed are bad, too, when it comes to data. Only ‘moshed’ can meet with Climate Audit approval. (well meshed may be ok depending upon the statistical standards of the meshing….)

  16. Gerald Browning
    Posted Nov 8, 2012 at 10:06 PM | Permalink

    The slowly evolving large scale component of the atmosphere in the midlatitudes is not two dimensional. Given the vertical component of the three dimensional vorticity (-u_y + v_x), the remaining variables of the slowly evolving component can be computed by solving a two dimensional (x,y) elliptic equation for the pressure p on each vertical (z) level, then a three dimensional elliptic equation for the vertical velocity w, and finally two dimensional (x,y) elliptic equations for the divergent portion of the wind (u_x + v_y) on each vertical (z) level. A reference pointing this out is available on request (Browning, Kreiss 2002).
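    As a loose illustration of the kind of two-dimensional elliptic inversion described (not Browning’s actual formulation), here is a Python sketch that recovers a streamfunction from its vorticity on a doubly periodic toy grid using FFTs; real models use bounded domains and different boundary conditions.

```python
import numpy as np

# Toy 2-D elliptic solve: invert Laplacian(psi) = zeta for psi, given zeta,
# on a doubly periodic grid via FFTs. Illustrative only.
n = 64
L = 2 * np.pi
x = np.linspace(0, L, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")

psi_true = np.sin(3 * X) * np.cos(2 * Y)     # known test field
zeta = -(3**2 + 2**2) * psi_true             # its Laplacian

k = np.fft.fftfreq(n, d=L / n) * 2 * np.pi   # integer wavenumbers on [0, 2*pi)
KX, KY = np.meshgrid(k, k, indexing="ij")
k2 = KX**2 + KY**2
k2[0, 0] = 1.0                               # avoid divide-by-zero for the mean mode

psi_hat = np.fft.fft2(zeta) / (-k2)
psi_hat[0, 0] = 0.0                          # fix the (undetermined) mean to zero
psi = np.real(np.fft.ifft2(psi_hat))

print(np.max(np.abs(psi - psi_true)))        # error near machine precision
```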

    For weather forecasts, the vertical component of vorticity is available from obs and, using the above theory, the slowly evolving component can be determined in the troposphere. However, the equations of motion are based on the assumption that they are only applied in the troposphere. As has been clearly shown in Silvie Gravel’s manuscript, they do not apply in the lower boundary layer, and the errors in the artificial parameterizations used there swamp the accuracy of the forecast in the entire domain within a few days. For forecast models these errors can be overcome by continually inserting new observational data for the vertical component of vorticity every 6 hours, thus reducing the error that has spread upward from the erroneous boundary layer.

    Notice, however, that the erroneous boundary layer parameterization is present in all climate models and that error is sufficient to invalidate the accuracy of any climate model in a few days.

  17. Jeff Condon
    Posted Nov 8, 2012 at 10:57 PM | Permalink


    What I see contains so much information that it would literally take me weeks to figure out what you already presented in this post. You are on a unique path.

    I wonder if you would consider some piecemeal posts on the smaller steps of your results. A community approach to science would be an excellent future for science in general.

    • NicL
      Posted Nov 9, 2012 at 8:34 AM | Permalink

      Your suggestion of posts on smaller steps is a good one, which I will try to action when I get some time freed up.

  18. Jean S
    Posted Nov 9, 2012 at 3:34 AM | Permalink


    There is an open source clone, GDL, which I have tried using, but it lacks too much functionality to run most of the code.

    What type of functionality is missing? I’m just asking because there seems to be so much high class programming talent available here that if the missing pieces are not so big, patching GDL by talented CA readers might be a viable option…

    Besides Nic, has anyone tried to run the code with GDL?

    • NicL
      Posted Nov 9, 2012 at 8:19 AM | Permalink

      It is mainly mathematical and statistical functions/procedures that are missing – quite a lot of them, even simple ones like DIAG_MATRIX, which I thought was implemented in GDL. Maybe the version of GDL I am using (which comes with Fedora, running under VirtualBox) isn’t the most up to date, although I only downloaded it in August. Maybe I am doing something wrong – I’m unfamiliar with all of VirtualBox, Fedora, GDL and IDL – not good.
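      For anyone attempting a port: if I have IDL’s semantics right, DIAG_MATRIX builds a diagonal matrix from a vector and extracts the diagonal from a matrix, so a one-line shim (shown here in Python/numpy purely for illustration) covers both cases.

```python
import numpy as np

def diag_matrix(a):
    """Rough stand-in for IDL's DIAG_MATRIX (behaviour assumed from the
    IDL docs): a 1-D input builds a diagonal matrix, a 2-D input
    extracts its diagonal."""
    a = np.asarray(a)
    return np.diag(a)  # np.diag has the same dual behaviour

m = diag_matrix([1.0, 2.0, 3.0])
print(m)               # 3x3 matrix with 1, 2, 3 on the diagonal
print(diag_matrix(m))  # back to [1. 2. 3.]
```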

      • Steve McIntyre
        Posted Nov 9, 2012 at 10:54 AM | Permalink

        A reader emailed me:

        I had a look at the IDL code in Forest 2006 from the Geophysical Research Letters. The code is turnkey and easy to run, the outputs are unformatted binaries and netcdf files, all together about 110 MB of data, plus a sensitivity plot.

        I can also send you the screen text outputs, I haven’t added them to avoid spam filters due to attached files.

        If you want to add the modification to error b) to test the output, I will be happy to help.

    • Hugo M
      Posted Nov 10, 2012 at 10:14 AM | Permalink

      There is an error in the head post concerning the link to the Forest 2006 archive. The URL should read h ttp:// (“GRL2006_reproduce.tgz” -> “GRL06_reproduce.tgz”)

      Patching the GDL program code itself is probably not necessary. The last time I tried to convert an IDL program into GDL only a handful of original IDL functions were missing. Via google I found them all in the form of original IDL modules somewhere on the internet. Simply download these modules to your working directory.

  19. Charles N
    Posted Nov 9, 2012 at 8:10 AM | Permalink

    From an old business software developer. Are you all saying that we have to look at actual code to determine what the software is doing? If so, this is really absurd. The application specifications should document what the software is doing before a line of code is written, or am I missing something? Why would you have to figure out code to determine what the author is doing? Does he not know before he starts coding? If he doesn’t know where he’s going, it would be pretty hard for someone else to follow.

    • NicL
      Posted Nov 9, 2012 at 8:25 AM | Permalink

      I think that the code does implement what its author (Chris Forest) intended, but that he didn’t understand what the correct statistical calculations were. And the methods descriptions in journal papers are almost never sufficiently detailed to reveal the existence of such mathematical/statistical mistakes even to the most careful reader.

      • Charles N
        Posted Nov 9, 2012 at 9:03 AM | Permalink

        I’m sure the math and statistics involved are very complex and hard to interpret and calculate; it just seems that this problem of getting data and code from other scientists to determine whether they are getting the right answers comes down to the scientific community having no standards for documenting and foolproof testing of software before publishing results. I don’t understand why this can’t be done, but that’s an issue for another day.

        • Carrick
          Posted Nov 9, 2012 at 11:44 AM | Permalink

          Interesting that you bring this up.

          Yesterday, I was having problems getting a USB device to “jee-haw” with the USB Host port on a single-board computer running an embedded Linux system.

          I solved the problem by looking at the kernel source.

          In cases where you can’t resolve the issue and don’t have source available, what are your options? Use a different USB device, a different SBC, or both.

          It seems that the places where source availability affects the outcome are more general than just scientific software.

        • Steven Mosher
          Posted Nov 9, 2012 at 3:13 PM | Permalink

          carrick, now you are kernel hacking.
          next you’ll whip out the soldering iron.
          dont brick your system.

        • Posted Nov 9, 2012 at 3:20 PM | Permalink

          Not quite, he didn’t say he changed his kernel but that by looking at the source he was able to work out how to get his USB device working. A great example of the power of open source. And, as usual, the ‘lurkers’ who only read far outnumber the dullards like me who try and write something as well. But those that lurk only can greatly benefit, that’s the point.

      • RomanM
        Posted Nov 9, 2012 at 9:16 AM | Permalink


        As it happens the F_3,v PDF is not that different from {1 – CDF} once the parameter combination is well away from the best fit point, so the effect of error (a) is not very substantial.

        Although what you say is true, this does not take into account the frequency of the sample values. In reality, the values of r2 in pp = 1.- f_pdf(r2/float(nf),nf,dof1) which are “well away from the best fit point” will occur less frequently than those near the best fit point, where the difference is greater.

        If X is a random variable with a continuous cumulative distribution function G(x) and probability density function g(x), then it is well known that the random variable Y = G(X) (and therefore also Z = 1 – G(X)) has a uniform distribution on the interval [0,1]. Thus if the r2 values come from the specified distribution, the pp values are simply a random sample from a uniform distribution. Note that this effect does not depend on the parameters of the distribution, so the theoretical distribution of the calculated pp values for all three cases (surface, upper-air, and deep-ocean) would be exactly the same.

        On the other hand, the distribution of the random variable L = g(X) depends on the specific shape of the density function and so will generally differ from distribution to distribution. For example, if X has an Exponential distribution with parameter 1, then g(x) = 1 – G(x), so L would also be uniformly distributed. If the exponential parameter were anything other than 1, this would no longer be the case.
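        The exponential remark above is easy to verify by simulation. A Python sketch (a translation of the idea, not part of RomanM’s R analysis): for rate 1 the density values g(X) behave like a uniform sample, for rate 2 they do not.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Rate-1 exponential: g(x) = exp(-x) = 1 - G(x), so L = g(X) is Uniform(0,1)
x = stats.expon.rvs(size=200_000, random_state=rng)
L1 = stats.expon.pdf(x)
print(L1.mean())   # close to 0.5, matching a Uniform(0,1) sample

# With a different rate the identity g = 1 - G fails, and g(Y) is not uniform
y = stats.expon.rvs(scale=0.5, size=200_000, random_state=rng)  # rate 2
L2 = stats.expon.pdf(y, scale=0.5)                              # values in (0, 2]
print(L2.mean())   # close to 1.0, not 0.5
```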

        The situation with the F distribution is less obvious because the density function is more complicated and the distribution of L would be more difficult to derive explicitly. However, we can use Monte Carlo techniques to get a good picture of what happens:

        Here, I have generated a sample of 200,000 values from an F(3,200) distribution and calculated both the upper tail probabilities (as done in the program) and the likelihood values using the density function. The histograms of the results would be a reasonably close match to the theoretical distributions. The R script is given below.

        I would suggest that the effect of the error could be more substantial than initially thought and that the only way to determine the magnitude of that effect is to run a complete corrected analysis.

        N = 200000                 # sample size
        df1 = 3                    # numerator degrees of freedom
        df2 = 200                  # denominator degrees of freedom
        xdata = rf(N, df1, df2)    # random draws from F(3,200)
        xdensity = df(xdata, df1, df2)       # likelihood values: density at each draw
        tailprob = 1 - pf(xdata, df1, df2)   # upper tail probabilities: 1 - CDF
        upper = max(1, max(xdensity) + .01)
        par(mfrow = c(2,1))
        hist(tailprob, breaks = seq(0, upper, .01), freq = F, main = "Incorrect Method", xlab = "Tail Probabilities")
        hist(xdensity, breaks = seq(0, upper, .01), freq = F, main = "Correct Method", xlab = "Likelihood Function Values")
        par(mfrow = c(1,1))
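        For readers without R, an equivalent check in Python (same distribution and sample size; scipy’s `sf` is the 1 − CDF used by the program):

```python
import numpy as np
from scipy import stats

N, df1, df2 = 200_000, 3, 200
rng = np.random.default_rng(1)
x = stats.f.rvs(df1, df2, size=N, random_state=rng)

tailprob = stats.f.sf(x, df1, df2)   # 1 - CDF: the "incorrect method"
density = stats.f.pdf(x, df1, df2)   # likelihood values: the "correct method"

# By the probability integral transform the tail probabilities are
# Uniform(0,1) whatever the parameters; the density values are not.
print(tailprob.mean(), tailprob.std())   # ~0.5 and ~0.289, the uniform moments
print(density.mean(), density.std())     # distribution-specific
```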

        • NicL
          Posted Nov 9, 2012 at 12:58 PM | Permalink


          Many thanks. I’ll need to think about what you say. I can tell you that in practice I don’t think this error does make a huge difference. I say that because I have tried using {1-CDF} in R instead of the PDF, and the final posterior density plots for the climate system parameters didn’t change much.

    • MikeN
      Posted Nov 9, 2012 at 10:38 AM | Permalink

      You are expecting software app specifications from scientists? What next, program verification?

    • Steven Mosher
      Posted Nov 9, 2012 at 2:43 PM | Permalink

      The writing of code for a science paper does not follow a ‘waterfall’ design process. So there is no spec, there is no pcode, there is no design review.

      Having looked at a bunch of “scientific” or “research” code, you can pretty much tell that it is written on the fly, haha, more like agile dev cause you may have co authors asking you to make changes all along the way and then reviewers can come along and demand changes.

      The best one can hope for is code that runs and produces the figures actually used in the paper. Even there you have guys who output datafiles and then use other packages to create graphics.

      On the one hand we would appreciate more clarity, but this stuff is not designed for sale or re use.

      That said, some guys do create re usable code, and a super small minority actually create documents and test cases etc.

      As with all things you see a range of talents and flaws. Guys who are really good at the code thing end up not doing science as their principal task. You’ll find them all over the R list creating packages.

      • James Smyth
        Posted Nov 9, 2012 at 7:02 PM | Permalink


        Of course it doesn’t follow waterfall, but that doesn’t mean it shouldn’t be well thought out, organized, factored for re-use and testing, etc.


        You don’t seem to understand this paradigm. The agile people I know are the biggest code fascists I’ve ever met.

        “not designed for sale or re use.”

        But, it should be. I could imagine that in the context of scientific research, modularity, re-use, etc is even more important than in general software development.

        • Posted Nov 9, 2012 at 7:35 PM | Permalink

          One of the great ironies of the history of software is that the waterfall diagram that was cited by many later papers as some kind of ideal was given as an example of how not to build software systems in the original.

          Tom Gilb loves that story. Tom’s the original agile pioneer in my book, from the 70s – and in the book of Craig Larman, who’s written a history of incremental, iterative development. I picked up Tom’s stuff in 1986 and made the link with pure object languages which seemed very obvious but wasn’t to everyone for another 15 years or so.

          Latter-day developers invoking the word agile have vastly different standards of code purity, in my own experience. Mosh wasn’t being as loose in his usage as all that.

          The main point on which I’d differ with James is that agile for me teaches one to focus totally on the next step – as well as refactoring the code in delivering it. But not on reuse. Clean code is always more reusable but in the current context, of scientific papers, all that matters is that the code does the current job correctly and is in a decent (not perfect) state, because it has been ‘refactored mercilessly’ throughout. (A dictum of Extreme Programming, which was a major component of Agile when it emerged as a buzzword in the early 00s.)

          Being test-driven, or test-infected as some say, is possibly the single most important tenet of agile. But I’m no expert in its application in the scientific context.

        • James Smyth
          Posted Nov 9, 2012 at 8:13 PM | Permalink

          “But I’m no expert in its application in the scientific context.”

          It occurs to me that the code in the post looks like something I might have written in grad school (in Matlab). Narrowly focused on some immediate project, rather than thinking “hey, this is some foundational stuff that I’ll be able to use in my ongoing research for years to come”. The benefits of taking the time to design it correctly not even on my radar.

        • Posted Nov 9, 2012 at 10:53 PM | Permalink

          If you know the future of your own career and discoveries well enough to do this, at any time of life, you are indeed worthy to called genius. The tenets of agile assume far less foreknowledge and that is what makes them useful for ordinary mortals.

        • Eddy
          Posted Nov 10, 2012 at 8:57 AM | Permalink

          James Smyth:
          “You don’t seem to understand this paradigm. The agile people I know are the biggest code fascists I’ve ever met.”
          With the big emphasis on testing in the agile paradigm I can believe that, though I’ve never seen the term ‘code fascists’ before. Certainly made me smile.

        • JFK
          Posted Nov 10, 2012 at 9:20 AM | Permalink

          As a big advocate of agile software development, I must point out that agile is not cowboy coding with no documentation and no spec, to the contrary most forms of it are quite rigorous. Don’t give agile a bad name by associating it with crappy code, no coding standards, and no spec, that’s not what it is.

      • MrPete
        Posted Nov 9, 2012 at 11:44 PM | Permalink

        Re: Steven Mosher (Nov 9 14:43),

        This whole thing mostly reminds me of Kernighan’s Law (Kernighan was a co-author of The C Programming Language):

        “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”

        These folks are writing dense, “clever” code, with little if any documentation. No wonder it’s buggy.

        • Posted Nov 10, 2012 at 1:39 AM | Permalink

          +1. Or should that be ++?

        • Posted Nov 10, 2012 at 2:49 AM | Permalink

          A major issue with dense clever code written for the moment, so to speak, is that if one takes a ‘breather’ for too long, remembering exactly what one was trying to do in clever dense (reversed on purpose) code is difficult.

          After a long breather, say a couple of months spent in intensive study of detailed results and planning presentations, it becomes very difficult to ‘recollect’ why and what code sections are doing.

          Coders who just comment out lines without a note or hint why are lazy or purposely obtuse. Perhaps there was never an intention to return to the code for generic replication, meaning that notes may be available separately for some, but not all. Makes me want to try the 2GB file just to look for notes.

          Caveats aside, the code listed above looks like reworked code from an earlier module that was slightly better at adhering to standards (header and obvious info documentation, versus some of the more intensive iteration modules).

          My apologies. I neither know nor own IDL, and my simple analysis mutterings here are based on languages I do know and have spent some time immersed in. The code bit above looks well written to me even if it is not following a planned design. Trouble is, there is a huge difference between well written (as in achieving the writer’s desires) and subsequently decipherable by anyone without total immersion.

        • MrPete
          Posted Nov 10, 2012 at 10:45 AM | Permalink

          Re: atheok,

          Trouble is, there is a huge difference between well written (as in achieving the writer’s desires) and subsequently decipherable by anyone without total immersion.

          And the person who most likely will have difficulty with subsequent deciphering… is the original author.

          People forget that they themselves will have trouble remembering what they were doing, given a year or two delay. 😦

        • Posted Nov 10, 2012 at 11:54 AM | Permalink


          And the person who most likely will have difficulty with subsequent deciphering… is the original author.

          Because they’re the only ones who are ever going to be interested or willing? 🙂 One of the more radical Extreme Programming techniques, being used more than I expected in London’s ‘Silicon Roundabout’ startups at present, is pair programming. This means that there are two original authors to be dragged back to the source but, far more important, it makes readability of the code front and centre at the moment of creation.

          Pairing with an Oxford physicist over Twitter is however new to me. It looks as if we may have cracked it. (I was mostly a rubber dummy for JJ to bounce his thoughts off. But it’s amazing how often that helps.)

        • Posted Nov 10, 2012 at 2:28 PM | Permalink

          Mr. Pete: You’re right on target!

          Richard Drake: “…Because they’re the only ones who are ever going to be interested or willing?…”

          Maybe yes, maybe no. I’d often wondered why my own code was sometimes so hard for me to follow after a long period away, even with comments and notes. A lot of the code I worked on ended up in a public (to the company) menu system that I allowed free downloads from, for people who wanted to customize their own modules.

          Programmers learn a multiplicity of styles as they get better. It is not hard for a programmer to see something original and quickly grasp the logic which they then wrap into their own code. I still find myself watching people’s screens to see how presentation, input and system commands are handled…

          When returning to code written some time ago, I think part of the problem is not recognizing (or perhaps detesting) code structure that one has ‘improved’ upon since writing it. That old piece of knowledge: we find most repugnant in others those traits of ourselves that we dislike.

          So it is harder to ‘accept’ that we ever practised such barbaric coding as is evidenced in our own old code right there for everyone to see. If it’s really bad code, you can actually blush when you recognize it.

          Programmers credo “plagiarize freely, claim nothing.”

          Glad to hear you’re on track to cracking the code, JJ and Richard!

      • Steven Mosher
        Posted Nov 11, 2012 at 3:30 AM | Permalink

        I didnt say it was agile, not by a long shot.

        please reread and pay attention to the specific frame of reference and the qualifiers.

        • James Smyth
          Posted Nov 12, 2012 at 4:35 PM | Permalink

          “qualifiers” oh, I see, other scientific code which (hey, just like this code) looks like crap, is “more like agile”. LOL. Yeah, sorry, I didn’t read you close enough.

        • Steven Mosher
          Posted Nov 14, 2012 at 11:39 PM | Permalink

          No the frame of reference is the requirements process.

        • James Smyth
          Posted Nov 20, 2012 at 4:59 PM | Permalink

          This is way past the point of pettiness that I would usually care about, but I stand by my original point that whoever originally said “having looked at a bunch of ‘scientific’ or ‘research’ code, you can pretty much tell that it is written on the fly, haha, more like agile dev cause you may have co authors asking you to make changes all along the way and then reviewers can come along and demand changes.” has (or had) no idea what Agile development is about. Maybe by now you’ve educated yourself enough to know that you could have used something like “extreme programming” rather than “agile dev” and gotten it by people who don’t know any better. Agile development is in fact quite adamant that reviewers can’t just come along and demand changes. It is precisely the well-defined, albeit short, cycle which defines Agile.

    • JFK
      Posted Nov 10, 2012 at 9:15 AM | Permalink

      Old business software developer? Sounds more like an old BA or project manager. Developers know that code rarely functions exactly as intended, and specs are never 100% complete. I’m sure a spec would say to use the MLE; you’d have to review the code to see that the ML is not computed correctly.

  20. DaveA
    Posted Nov 9, 2012 at 9:41 AM | Permalink

    IDL is expensive, what about our bloody carbon tax and Department of Climate Change! Can one of you bureaucrats fly economy instead of business just once so Nic can sort out this problem at the front line?

  21. EdeF
    Posted Nov 9, 2012 at 9:47 AM | Permalink

    Probably takes 6 mo to a year to be reasonably comfortable with a new language.
    The key here is that if IDL, GDL experts can be found, a qualified stats guy is already there to lead them.

  22. Posted Nov 9, 2012 at 11:24 AM | Permalink

    I am in discussions with NicL at the moment, and am close to getting the code working here.

  23. Posted Nov 9, 2012 at 11:58 AM | Permalink

    The IDL code is clearly written for use on some sort of unix system, and unless I’m doing something silly is not turnkey on Windows 7 which I am using. But it doesn’t look too hard to hack; just a bit tedious.

    • Steven Mosher
      Posted Nov 9, 2012 at 2:28 PM | Permalink

      looks like we need to equip you for dual boot. hehe

      • Posted Nov 9, 2012 at 2:42 PM | Permalink

        I was somewhat surprised when I moved from chemistry to physics to find that I was also moving from unix to Windows. I do have a PC running Linux as well, but I hardly ever use it. My days as a Solaris sysadmin are mostly over, though I still run one legacy system.

        • Steven Mosher
          Posted Nov 9, 2012 at 3:08 PM | Permalink

          I’ve been forced back into Linux by some of the remote sensing stuff I do and now my dual boot Mac is acting up. Im pretty sure I screwed it up royally. My options are to screw up my windows system or buy a Linux box.

        • James Smyth
          Posted Nov 9, 2012 at 7:03 PM | Permalink

          Have you tried virtual environments?

  24. Steven Mosher
    Posted Nov 9, 2012 at 3:01 PM | Permalink

    I will add one oddity here that perhaps Miles can address if he wishes to. Early on NicL was kind enough to share his paper on Forest (just prior to the submission deadline). My head hurt.
    Now as a SOD reviewer I thought I would see how the paper was handled. Hmm. Without revealing details, all I can say is that the paper got “tabled.” Tabling is a maneuver you employ to say something akin to “that’s interesting, can we move on.”
    Now Nic has a programming error to expose, and I would characterize the approach taken as “tabling”.

    Isnt this an opportunity for the lead authors to call a conference to discuss the issue openly rather than tabling Nic’s contributions?

    Just asking. in my mind its similar to the whole briffa affair. where you have some serious questions that should be addressed openly in a symposium of those concerned.

    • Skiphil
      Posted Nov 9, 2012 at 3:19 PM | Permalink

      Re: Steven Mosher (Nov 9 15:01),

      an aside on usage: in the USA to “table” a motion or paper, article etc. means to put it aside for non-consideration, to “shelve” it and ignore it

      but in the UK and parliamentary traditions to “table” a motion means to bring it forward for discussion and debate, just about an opposite meaning

      two very different tables with differing purposes

      so non-US readers might find our American usage confusing, although presumably they will figure it out from the context

      • theduke
        Posted Nov 9, 2012 at 3:54 PM | Permalink

        Interesting. I ran into the exact same issue with a new member of a local governing board I serve on here in Southern California. When he used the word “table,” he meant to bring it up for discussion. I’d never heard it used that way. It may be that he or his parents are transplanted Brits or something. He also uses the word, “chaps” to mean “guys” or “fellows.” No detectable accent, however.

        • Steven Mosher
          Posted Nov 9, 2012 at 5:26 PM | Permalink

          most likely. the first time I tried to “table” something with a brit he insisted on talking about it. doh!

          If you really want to puzzle them say “lets table that and get back to it later” circuit fry.

        • AndyL
          Posted Nov 9, 2012 at 6:44 PM | Permalink

          to really confuse a Brit, offer to table something momentarily. He will expect a very short discussion

        • Reference
          Posted Dec 9, 2012 at 5:46 AM | Permalink

          England and America are two countries separated by a common language.
          George Bernard Shaw

    • Posted Nov 9, 2012 at 3:23 PM | Permalink

      I can’t think of a better example of the need for openness of the IPCC in its 24 years, from my admittedly limited knowledge. Thanks for sharing what you can Steve. (And I did get the meaning of ‘table’ Skip, though well worth mentioning.)

  25. NicL
    Posted Nov 9, 2012 at 5:28 PM | Permalink

    I rather liked this description at the start of the program “detect” in the Forest 2006 code, which I think is probably also a key module in some of the main detection and attribution studies:

    ; Given a set of vectors of observations containing no missing data,
    ; a set of similar vectors from model simulations with changing forcing and a
    ; control simulation, computes confidence intervals on undetermined model
    ; parameters by multiple linear regression.
    ; INPUT:
    ; 1) OBS(imx,ndobs)=observations
    ; 2) CHF(imx,nensm,nexpt)=runs with changing forcing: nexpt ensembles each
    ; with nensm runs in each ensemble element
    ; 3) CT1(imx,ndct1)=control run for estimating pre-whitening transformation
    ; 4) CT2(imx,ndct2)=control run for hypothesis-test, only used if twoest set
    ; 5) WGT(imx)=vector of weights
    ; 6) neofc=number of EOFs of the control run to retain in the noise model
    ; must be less than or equal to the smaller of imx and ndct1.
    ; Ideally, ndct1 >> imx (like, in your dreams), and neofc=imx.
    ; In reality… there isn’t much to guide us here.

    It reminds me a bit of the Harry Readme file.

    To be fair, I may be taking this out of context. I haven’t yet parsed what exactly this program is used for in Forest 2006. The warning comments may be irrelevant there.
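    Reading only that header, the core recipe looks like standard optimal fingerprinting: estimate the noise covariance from the control run, truncate to neofc EOFs, pre-whiten, then regress observations on the forced-run patterns. A rough Python sketch of that recipe, with all inputs invented (this is my reading of the comments, not the actual `detect` algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
imx, ndct1, neofc = 40, 300, 20      # space dim, control-run length, retained EOFs

# Invented stand-ins for the inputs named in the header
CT1 = rng.standard_normal((ndct1, imx))          # control run (noise sample)
signal = rng.standard_normal(imx)                # forced-run response pattern
OBS = 0.8 * signal + rng.standard_normal(imx)    # "observations": scaled signal + noise

# Pre-whitening transform: EOFs of the control run, truncated to neofc
C = np.cov(CT1, rowvar=False)                    # imx x imx noise covariance estimate
evals, evecs = np.linalg.eigh(C)
order = np.argsort(evals)[::-1][:neofc]          # leading neofc EOFs
P = evecs[:, order] / np.sqrt(evals[order])      # whitening projection (imx x neofc)

# The whitened control data has identity sample covariance by construction
assert np.allclose(np.cov(CT1 @ P, rowvar=False), np.eye(neofc))

# Regress pre-whitened observations on the pre-whitened signal pattern
y, Xw = OBS @ P, signal @ P
beta = (Xw @ y) / (Xw @ Xw)                      # least-squares amplitude
print(beta)                                       # estimate of the true scaling 0.8
```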

  26. Pat Frank
    Posted Nov 9, 2012 at 9:26 PM | Permalink

    Nic: “The most popular observationally-constrained method of estimating climate sensitivity involves comparing data whose relation to S is too complex to permit direct estimation, such as temperatures over a spatio-temporal grid, with simulations thereof by a simplified climate model that has adjustable parameters for setting S and other key climate properties. The model is run at many different parameter combinations; the likelihood of each being the true combination is then assessed from the resulting discrepancies between the modelled and observed temperatures. This procedure requires estimates of the natural spatio-temporal covariability of the observations, which are usually derived from AO-GCM control runs, employing an optimal fingerprinting approach. A Bayesian approach is then used to infer an estimated probability density function (PDF) for climate sensitivity.”

    Everything that’s wrong with climate science, as science.
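    A toy, one-parameter version of the procedure quoted above may make the objection concrete. This is illustrative only: Forest 2006 jointly estimates three parameters against gridded diagnostics with covariance estimated from control runs, whereas here the “model”, “observations”, grid, and noise level are all invented.

```python
# Toy version of the quoted procedure: run a "model" over a grid of
# sensitivities S, score each against observations via the discrepancies,
# and form a Bayesian posterior with a uniform-in-S prior.
import math
import random

random.seed(1)
xs = (0.5, 1.0, 1.5, 2.0)        # stand-in forcing pattern
true_S = 3.0
obs = [true_S * x + random.gauss(0, 0.3) for x in xs]

def model(S):
    """Run the toy 'climate model' at sensitivity S."""
    return [S * x for x in xs]

def log_like(S, sigma=0.3):
    """Gaussian log-likelihood from model-observation discrepancies."""
    return sum(-0.5 * ((o - m) / sigma) ** 2 for o, m in zip(obs, model(S)))

S_grid = [1.0 + 0.05 * i for i in range(81)]   # S from 1.0 to 5.0
w = [math.exp(log_like(S)) for S in S_grid]
total = sum(w)
posterior = [wi / total for wi in w]
S_mode = S_grid[posterior.index(max(posterior))]   # should land near true_S
```

    The uniform-in-S prior in the last step is exactly the kind of choice at issue in these studies: a different prior (e.g. uniform in feedback, 1/S) reshapes the upper tail of the posterior without any change to the data or the model.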

    • NicL
      Posted Nov 10, 2012 at 9:57 AM | Permalink


      The wording in Appendix 9.B of AR4 WG1 that I based this on actually started:

      “Recent studies estimating climate sensitivity from observations can all be cast in terms of an approach that is closely related to climate change detection methods.”

      – as is evident from the re-use of detection & attribution code in Forest 2006.

      So various of the shortcomings you allude to in such climate sensitivity studies also apply to many detection and attribution studies, as you no doubt appreciate, although the problems with biased Bayesian inference apply only to the climate sensitivity studies.

      I thought your comments on the FOD of Chapter 10 of AR5 WG1 were very much to the point, BTW.

      • NicL
        Posted Nov 10, 2012 at 10:20 AM | Permalink

        Sorry, I was getting mixed up. On reflection, I think that the FOD comments I referred to were actually by Fred Pearce.

      • Pat Frank
        Posted Nov 10, 2012 at 4:10 PM | Permalink

        Hi Nic, I didn’t intend that the quoted description should be ascribed to you. It was clear you were quoting from AR4 and I should have specified that.

        That said, the whole use-of-models-to-test-models methodology seems fatally self-referential to me.

        • Gerald Browning
          Posted Nov 11, 2012 at 12:12 AM | Permalink


          My original reply seems not to have been posted.

          Are we the only scientists who can see how silly it is to argue about statistics on climate sensitivity being computed using a climate model that does not even remotely accurately describe the dynamics and physics of the atmosphere?

          It is up to the climate “scientists” to prove that their climate model is accurate, and there is no mathematical or numerical analytic theory that can be used to do so.

        • johnfpittman
          Posted Nov 11, 2012 at 7:03 AM | Permalink

          No. Craig and I were with you two over at RC when Gavin was taking you on, and finally just shut the discussion down. Over at Dr. Curry’s is this:

          From the conclusions of: Simulating regime structures in weather and climate prediction models

          A. Dawson, T.N. Palmer, S. Corti

          Understanding gained from studies of low-dimensional dynamical systems suggests that the response to external forcing of a system with regimes is manifested primarily in changes to the frequency of occurrence of those regimes. This implies that a realistic simulation of regimes should be an important requirement from climate models. We have shown that a low resolution atmospheric model, with horizontal resolution typical of CMIP5 models, is not capable of simulating the statistically significant regimes seen in reanalysis, yet a higher resolution configuration of the same model simulates regimes realistically. This result suggests that current projections of regional climate change may be questionable. This finding is also highly relevant to regional climate modelling studies where lower resolution global atmospheric models are often used as the driving model for high resolution regional models. If these lower resolution driving models do not have enough resolution to realistically simulate regimes, then the boundary conditions provided to the regional climate model could be systematically erroneous. It is therefore likely that the embedded regional model may represent an unrealistic realization of regional climate and variability.

          Dr. Browning I find the last sentence objectionable wrt your work. I thought your work demonstrated that such large mesh size, small dimensional approaches cannot represent regional climate and variability, not MAY NOT.

        • Pat Frank
          Posted Nov 11, 2012 at 2:07 PM | Permalink

          Jerry, I don’t understand it.

        • Gerald Browning
          Posted Nov 11, 2012 at 6:50 PM | Permalink

          John and Pat

          It is completely hilarious what meteorologists and climate scientists claim about limited area forecast and climate models. In the case of limited area forecast models, because the boundary conditions are generated by a global forecast model that does not allow mesoscale storms and large scale gravity waves, the inflow boundary condition information needed to accurately depict mesoscale storm mechanisms in the limited area is inaccurate. In addition, because of observational data errors and erroneous parameterizations in the global model, the large scale boundary data is inaccurate, and that error impacts large areas of the limited area domain. This can most easily be seen by considering that the slowly evolving large scale flow satisfies elliptic equations for all variables except the vertical component of the vorticity. Large scale errors on the boundaries lead to large scale errors in the interior for solutions of such elliptic equations (reference depicting this problem available on request).
          In addition, the physical 3d latent heating initial data in the limited area must be accurate as that data has an instantaneous impact on any mesoscale storm because the vertical velocity is directly proportional to that heating.
          As a side note, if the global forecast model uses the hydrostatic assumption and tries to accurately model mesoscale storms, the ill posedness of the hydrostatic equations will reveal itself.

          All of this has been swept under the rug.


  27. Posted Nov 10, 2012 at 3:19 PM | Permalink

    Thanks Richard, the conversation was indeed helpful (even though, I’m afraid, I actually ended up installing astrogrep rather than using your solution…).

    The code is now up and running in Oxford under Windows 7, and I understand somebody has got it working in Austria under unix. So we can start making changes once Nic decides exactly what he wants to do.

    The 2GB is mostly data files; the actual code isn’t huge. The code strikes me as typical academic code: it has been bodged together out of a group of underlying modules with odd bits commented out or disabled by flags. It looks to me like this turnkey version is in fact adapted from an earlier version which was even more hard coded for one particular environment. Not as user hostile as some of the things I write though.

    • Posted Nov 10, 2012 at 4:18 PM | Permalink

      Ah well – grep was always best because you know it. I’d say it’s very useful to have multiple working versions in different environments. Maximum verification as my company used to say in our guide to agile in days of yore. All the best to Nic in taking this forward.

  28. stacyglen
    Posted Nov 10, 2012 at 5:49 PM | Permalink

    The single-user license (for a single person, not a company, limited to a single host) is 4200 Euros. I registered and checked today. This is a lot of money for an individual to find. But it is a modest amount of money for any well-funded Carbon Conspiracy to find – if they existed.

  29. Patagon
    Posted Nov 11, 2012 at 5:17 PM | Permalink

    Forest (2006) states:
    “It is not possible to estimate the true climate system variability on century time-scales from observations and therefore, climate models are run with fixed boundary conditions for thousands of years to obtain estimates of the climate variability.”

    So, strictly speaking, they are detecting the sensitivity of the model, not that of the climate system.

    The model used is the MIT 2D statistical-dynamical climate model [Sokolov and Stone, 1998].

    This is a bit old, I would say, but it is interesting to look at Table 3 of Sokolov and Stone (1998), where the MIT 2D surface energy budget is compared to other models. If we take for example NCAR, MIT 2D has a decrease of 3.6 (W/m^2?) for a doubling of CO2 in latent heat turbulent transfer and an increase of 3.3 in shortwave radiation. These differences are of the same magnitude as the radiative forcing due to CO2, and therefore the error bars make the detection of CO2 forcing rather unlikely.

    Nonetheless, it would be interesting to see if the statistical error also makes an additional and large difference (that would require something like 3D error bars).
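    As a rough cross-check of the magnitudes being compared, the CO2 doubling forcing can be computed from the standard simplified expression of Myhre et al. (1998). This is a sketch: the 5.35 ln(C/C0) formula and the 280 ppm baseline are the usual textbook values, not anything taken from Sokolov and Stone.

```python
# Approximate CO2 radiative forcing relative to a pre-industrial baseline,
# using the Myhre et al. (1998) simplified expression F = 5.35 ln(C/C0).
import math

def co2_forcing(c_ppm, c0_ppm=280.0):
    """CO2 radiative forcing in W/m^2 for concentration c_ppm vs baseline c0_ppm."""
    return 5.35 * math.log(c_ppm / c0_ppm)

f2x = co2_forcing(560.0)   # doubling: about 3.7 W/m^2
```

    The 3.3–3.6 W/m^2 inter-model surface-flux differences quoted above are indeed of the same order as this ~3.7 W/m^2 doubling forcing, which is the point of the comparison.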

    • Paul_K
      Posted Nov 12, 2012 at 10:37 PM | Permalink

      Have a look at Figure 9.3 of AR4, read the supporting text and see what conclusions you come to.

      • Patagon
        Posted Nov 13, 2012 at 2:53 AM | Permalink


        In that case differences go up to 6 W/m^2. I would like to see a comparison of the values for actual fluxes in every model, that would be telling. But even for each model individually, differences from one version to another are larger than the magnitude whose effect they are set to detect. As I pointed out at the Bishop, a recent update of the Met Office model shows changes of up to 15 W/m^2 due to “improvements” in albedo and emissivity.

        In the end it is like pretending you can detect a milligram change in weight using a balance with a precision of one kilogram: you only need to use many balances, model the measurements for a hundred years, and show an ensemble mean of the results without error bars and with very obtuse wording.

  30. Skiphil
    Posted Nov 11, 2012 at 10:53 PM | Permalink

    This is a good time to read or recall an “oldie but goodie” — one of the seminal posts on the lack of due diligence in too much of climate science:

  31. Geoff Sherrington
    Posted Nov 12, 2012 at 5:24 AM | Permalink

    Australia was a lonely place for early adopters of computers in the 1970s. There was no Internet of supporters, no FAQ lists, there were rudimentary languages and many cups of coffee. I did call myself a programmer, because armed with a Data General Nova and a two week course in machine language, I was set loose to do my lonesome best. Then came a DEC PDP8 and Basic, but there was no mass storage facility and Basic took up a lot of the few kilobits of memory. Factors like this led to an economy of programming style, which was thankfully subsumed when I joined a large corporation with a beaut group of intelligent, dedicated programmers and a HP 2100. It was so much easier for me to specify then reject, than it was to write my own to reject. I was in and out of computing over the decades, but saw a lot from fairly close without being hands on.

    There was a philosophic change to the purpose of programming in science. Early on, people wrote to do large calculations that could be replicated and varied with near-predictable outcomes. There was virtually no ‘exploratory programming’ that I was exposed to, whereby one followed a trail of code like a dingo following the droppings of a sick sheep, until the sheep died and all was forgotten except for some extended survival time of paid programmers.

    These days I think of problems of this complexity: You are in an early nuclear submarine that has been underwater for 100 days, straining your navigation system. You receive a command to fire a Polaris missile at a given target. Who would like to program the various parameters that have to be accounted for, like the slow but persistent up and down of tides, the difficulty of fixing a reliable reference point, spherical geometry, Earth shapes, collisions with whales, the presence of inversions on communications of test signals etc etc.

    That’s my simple understanding of what we do now, akin to weather movements in some ways. We seem to end up with a series of modules that can sometimes (and cannot sometimes) be integrated into a larger framework. In the worst case, nobody can later remember the purpose of a module or its initial assumptions or the coefficients used with it.

    Maybe this is why many of us passed through a few years of avidly solving Martin Gardner’s monthly puzzles in ‘Scientific American’. They were much more than puzzles, they were teachings on how to analyse a problem and to structure its solution. There was useless information thrown in to mislead the reader, there were problems that highly intelligent people left unsolved after weeks, when the answer could be arrived at by a different path in a few minutes in the head.

    Maybe, with my inexperience of hands on, I’m being unkind with these comments, but my intention was more to show that the calculator became a computer but the computer has not yet become a logical thinker. Too often, it sits working on a problem ‘just because it can’ when that problem could be solved without a computer in a few minutes.

    Here’s what I mean

  32. Alexander K
    Posted Nov 12, 2012 at 5:57 PM | Permalink

    Excellent point and well put, Geoff.
    I worked in a university department about forty years ago which, like every department there at that time, relied heavily on the campus mainframe computer to process the information our academics fed into the thing via cards punched by the staff at the central computer facility. I was instrumental in our department purchasing and installing a ‘programmable calculator’ from Wang, which used a library of programmes, each stored on what appeared to be a standard audio cassette. It took all the time-consuming drudgery of punching cards out of crunching big numbers, as the parameters for each task could be entered from the calculator’s qwerty keyboard, saving immense amounts of time and cost.
    My office, where the device was located, soon became a popular visiting place for scientists from other departments who found that the machine could do in a few minutes tasks which took a week to get through the mainframe computer.
    ‘Programmable calculators’ began to proliferate on the campus but this proliferation was soon halted by the arrival of the first-generation IBM PC on site.
    Since that time, it has become unfashionable to ‘think a problem through’. The scientific world appears to be in the thrall of statistical manipulation via computers, when often as your example shows, the human brain can be a powerful tool.

  33. Posted Nov 14, 2012 at 10:48 AM | Permalink

    It’s hard to believe that IDL would name a subroutine that computes the F CDF “f_pdf”, but that is indeed the case:

    I don’t follow how Forest was using this function, but if indeed he wanted the PDF to compute a likelihood, the study should be redone correctly.
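    The distinction at stake can be illustrated without IDL. In this stdlib-only sketch the degrees of freedom and the evaluation point are hypothetical, and a Monte Carlo construction of F variates stands in for a proper CDF routine; the point is just that “1 − CDF” is an upper-tail probability, not a density.

```python
# Monte Carlo estimate of the F-distribution CDF at a point, for hypothetical
# degrees of freedom (3, 20). An F variate is (chi2_d1/d1)/(chi2_d2/d2),
# with each chi-square built as a sum of squared standard normals.
import random

random.seed(0)
d1, d2 = 3, 20

def f_sample():
    x = sum(random.gauss(0, 1) ** 2 for _ in range(d1)) / d1
    y = sum(random.gauss(0, 1) ** 2 for _ in range(d2)) / d2
    return x / y

n = 20000
v = 2.5
cdf_est = sum(f_sample() <= v for _ in range(n)) / n   # roughly 0.9 for these df
tail = 1.0 - cdf_est   # what "1 - f_pdf" yields in IDL, since f_pdf is the CDF
```

    So a likelihood built from “1 − f_pdf” is a p-value-like tail probability, not the density the function’s name suggests, which is why how Forest used it matters.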

    • Hugo M
      Posted Nov 15, 2012 at 2:19 AM | Permalink

      Hu, Forest also uses fcdf in

      if (keyword_set(ctlind)) then begin
      if (keyword_set(pv_wave)) then $
      Presid(k_t)=1.-fcdf(rssq_k/(ndofd-nscn),ndofd-nscn,ndofn) else $
      endif else begin

      However, fcdf has an empty function body.
      Whether fcdf is ever called is difficult to say, because so far I have not succeeded in running the program under GDL.

    • NicL
      Posted Nov 22, 2012 at 12:57 PM | Permalink

      Hu, thanks for your comment. Forest was certainly using the “f_pdf” function, or rather “1 – f_pdf”, to compute a likelihood. And I have found some more statistical “funnies” in the code since writing the above article.

      Hugo, I think that there the f_pdf function is being used, correctly, to get the F-distribution CDF, for a different purpose. I don’t know what the “fcdf” function is there for, but it looks as if it may be something to do with the use of PV-WAVE, a program closely related to IDL.

  34. Skiphil
    Posted Dec 19, 2012 at 6:10 PM | Permalink

    Don’t miss current discussions on further work by Nic Lewis on climate sensitivity and the IPCC estimates. Threads now at Bishop Hill, WUWT, and Climate Etc. Kudos to Nic Lewis and all who have contributed to this progress.

  35. Posted Jan 16, 2013 at 9:22 PM | Permalink

    The 2GB file of code and data at the original URL is no longer accessible. Nothing is accessible at the revised URL from the comments either.

    Could someone, who downloaded the file already, make a copy publicly available over the long term? (I am happy to host it long term on my own website if someone wants to get me a copy.)

    • Hugo M
      Posted Jan 17, 2013 at 6:35 AM | Permalink


      A file of identical size and md5sum (983d3b3ec00c192a9abe32aef82cc1f3) can be found at h ttp:// . The directory contains more files, possibly also a copy of the lost raw data?

       [TXT]  GRL06_plotfilelist.txt  10-Dec-2012 17:43    12K   
       [   ]  GRL06_rawdata.tgz       10-Dec-2012 17:33   8.4G   
       [   ]  GRL06_reproduce.tgz     01-Aug-2012 11:10   2.0G   
       [   ]  GRL06_reproduce2.tgz    03-Aug-2012 17:33   1.2G   
       [   ]  GRL11_reproduce.tgz     13-Nov-2012 16:53    98M   
       [DIR]  GSOLSV/                 03-Jun-2010 13:57    -    
       [   ]  curry_z4_ctl.tar        08-Aug-2012 15:17    60K   

5 Trackbacks

  1. […] […]

  2. By The Climate Change Debate Thread - Page 1745 on Nov 12, 2012 at 5:46 AM

    […] […]

  3. […] in Figure 9.21 of AR4 WG1: Forest 06 and Knutti 02. Forest 06 has several statistical errors (see here) and other problems. Knutti 02 used a weak statistical procedure and an arbitrary combination of […]

  4. […] Forest 2006 likelihood functions. I wrote a detailed article about these errors at Climate Audit, here. But it appears that even in combination with the surface diagnostic 9 month timing mismatch these […]

  5. […] basic statistical errors in the F06 code, about which I wrote a detailed article at Climate Audit, here. But those errors could not have affected any unrelated studies.  In this post, I want to focus on […]
