Marvel et al.: GISS did omit land use forcing

A guest article by Nic Lewis

I reported in a previous post, here, a number of serious problems that I had identified in Marvel et al. (2015): Implications for climate sensitivity from the response to individual forcings. This Nature Climate Change paper concluded, based purely on simulations by the GISS-E2-R climate model, that estimates of the transient climate response (TCR) and equilibrium climate sensitivity (ECS) based on observations over the historical period (~1850 to recent times) were biased low.

I followed up my first article with an update that concentrated on land use change (LU) forcing. Inter alia, I presented regression results that strongly suggested the Historical simulation forcing (iRF) time series used in Marvel et al. omitted LU forcing. Gavin Schmidt of GISS responded on RealClimate, writing:

“Lewis in subsequent comments has claimed without evidence that land use was not properly included in our historical runs…. These are simply post hoc justifications for not wanting to accept the results.”

In fact, not only had I presented strong evidence that the Historical iRF values omitted LU forcing, but I had concluded:

“I really don’t know what the explanation is for the apparently missing Land use forcing. Hopefully GISS, who alone have all the necessary information, may be able to provide enlightenment.”

When I responded to the RealClimate article, here, I presented, inter alia, further evidence that LU forcing hadn’t been included in the computed value of the total forcing applied in the Historical simulation: there was virtually no trace of LU forcing in the spatial pattern for Historical forcing. I wasn’t suggesting that LU forcing had been omitted from the forcings applied during the Historical simulations, but rather that it had not been included when measuring them.

Yesterday, a climate scientist friend drew my attention to a correction notice published by Nature Climate Change, reading as follows:

Corrected online 10 March 2016

“In the version of this Letter originally published online, there was an error in the definition of F2×CO2 in equation (2). The historical instantaneous radiative forcing time series was also updated to reflect land use change, which was inadvertently excluded from the forcing originally calculated from ref. 22. This has resulted in minor changes to data in Figs 1 and 2, as well as in the corresponding main text and Supplementary Information. In addition, the end of the paragraph beginning ‘Scaling ΔF for each of the single-forcing runs…’ should have read ‘…the CO2-only runs’ (not ‘GHG-only runs’). The conclusions of the Letter are not affected by these changes. All errors have been corrected in all versions of the Letter. The authors thank Nic Lewis for his careful reading of the original manuscript that resulted in the identification of these errors.”

Bayesian parameter estimation: Radiocarbon dating re-examined

A guest article by Nic Lewis


In April 2014 I published a guest article about statistical methods applicable to radiocarbon dating, which criticised existing Bayesian approaches to the problem. A standard – subjective Bayesian – method of inference about the true calendar age of a single artefact from a radiocarbon date determination (measurement) involved using a uniform-in-calendar-age prior. I argued that this did not, as claimed, equate to not including anything but the radiocarbon dating information, and was not a scientifically sound method for inference about isolated examples of artefacts.[1]

My article attracted many comments, not all agreeing with my arguments. This article follows up and expands on points in my original article, and discusses objections raised.

First, a brief recap. Radiocarbon dating involves determining the radiocarbon age of (a sample from) an artefact and then converting that determination to an estimate of the true calendar age t, using a highly nonlinear calibration curve. It is this nonlinearity that causes the difficulties I focussed on. Both the radiocarbon determination and the calibration curve are uncertain, but errors in them are random and in practice can be combined. A calibration program is used to derive estimated calendar age probability density functions (PDFs) and uncertainty ranges from a radiocarbon determination.

For a single artefact, the standard calibration program OxCal that I concentrated on uses a subjective Bayesian method with a prior that is uniform over the entire calibration period. Calendar age uncertainty ranges for an artefact whose radiocarbon age is determined (subject to measurement error) can be derived from the resulting posterior PDFs. They can be constructed either from one-sided credible intervals (finding the values at which the cumulative distribution function (CDF) – the integral of the PDF – reaches the two uncertainty bound probabilities), or from highest probability density (HPD) regions containing the specified total probability.
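To make the uniform-prior calculation concrete, here is a minimal sketch. The calibration curve, the measurement value, and the error figure are all invented for illustration; real curves such as IntCal are tabulated and far more irregular, and OxCal’s error model is more elaborate.

```python
import numpy as np

# Toy calibration curve: radiocarbon age as a function of true calendar age t.
# A monotonic stand-in, purely illustrative - not a real calibration curve.
def calib_curve(t):
    return t + 100.0 * np.sin(t / 300.0)

def posterior_uniform_prior(y_meas, sigma, t_grid):
    """Posterior PDF over calendar age under a uniform-in-calendar-age prior.

    y_meas -- radiocarbon determination (14C years BP)
    sigma  -- combined measurement + calibration-curve error (assumed Gaussian)
    """
    like = np.exp(-0.5 * ((y_meas - calib_curve(t_grid)) / sigma) ** 2)
    dt = t_grid[1] - t_grid[0]
    return like / (like.sum() * dt)      # uniform prior: just normalise

t_grid = np.linspace(0.0, 5000.0, 5001)
post = posterior_uniform_prior(y_meas=3000.0, sigma=30.0, t_grid=t_grid)

# One-sided credible interval bounds from the posterior CDF (integral of PDF)
dt = t_grid[1] - t_grid[0]
cdf = np.cumsum(post) * dt
lo = t_grid[np.searchsorted(cdf, 0.025)]
hi = t_grid[np.searchsorted(cdf, 0.975)]
print(f"95% credible interval: {lo:.0f} to {hi:.0f} cal BP")
```

An HPD region would instead be built by accumulating grid points in descending order of posterior density until the required total probability is reached; for a multimodal posterior it can comprise several disjoint intervals.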

In the subjective Bayesian paradigm, probability represents a purely personal degree of belief. That belief should reflect existing knowledge, updated by new observational data. However, even if that body of knowledge is common to two people, their probability evaluations are not required to agree,[2] and neither evaluation need properly reflect the knowledge on which it is based. I do not regard this as a satisfactory paradigm for scientific inference.

I advocated taking instead an objective Bayesian approach, based on using a computed “noninformative prior” rather than a uniform prior. I used as my criterion for judging the two methods how well they performed upon repeated use, hypothetical or real, in relation to single artefacts. In other words, when estimating the value of a fixed but unknown parameter and giving uncertainty ranges for its value, how closely would the actual proportion of cases in which the true value lies within each given range match the proportion indicated? That is to say, how good is the “probability matching” (frequentist coverage) of the method. I also examined use of the non-Bayesian signed root log-likelihood ratio (SRLR) method, judging it by the same criterion.
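Probability matching can itself be checked by simulated repeated use. The sketch below (toy calibration curve and all numbers invented) draws many true ages, simulates a radiocarbon determination for each, and counts how often the nominal 95% uniform-prior interval covers the truth:

```python
import numpy as np

rng = np.random.default_rng(0)

def calib_curve(t):
    # toy stand-in for a real calibration curve (illustrative only)
    return t + 100.0 * np.sin(t / 300.0)

def credible_interval(y_meas, sigma, t_grid, alpha=0.05):
    """Equal-tailed credible interval under a uniform-in-calendar-age prior."""
    like = np.exp(-0.5 * ((y_meas - calib_curve(t_grid)) / sigma) ** 2)
    cdf = np.cumsum(like) / like.sum()
    lo = t_grid[np.searchsorted(cdf, alpha / 2)]
    hi = t_grid[np.searchsorted(cdf, 1 - alpha / 2)]
    return lo, hi

t_grid = np.linspace(0.0, 5000.0, 5001)
sigma = 30.0
n_trials = 2000
hits = 0
for _ in range(n_trials):
    t_true = rng.uniform(500.0, 4500.0)          # unknown true calendar age
    y = calib_curve(t_true) + rng.normal(0.0, sigma)  # 14C determination
    lo, hi = credible_interval(y, sigma, t_grid)
    hits += lo <= t_true <= hi
print(f"coverage of nominal 95% interval: {hits / n_trials:.3f}")
```

Note the caveat: drawing the true ages from the prior itself guarantees matching on average, whatever the prior. The criterion discussed in the article is coverage at each fixed true age, which the same sketch can probe by replacing the `rng.uniform` draw with a constant and repeating only the measurement error.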

Gerry Browning: In Memory of Professor Heinz Kreiss

Gerry Browning writes:

The Correct System of Equations for Climate and Weather Models

The system of equations numerically approximated by both weather and climate models is called the hydrostatic system. Using a scale analysis for mid-latitude large-scale motions in the atmosphere (motions with a horizontal length scale of 1000 km and a time scale of a day), Charney (1948) showed that hydrostatic balance, i.e., balance between the vertical pressure gradient and the gravitational force, is satisfied to a high degree of accuracy by these motions. Because the fine balance between these terms was difficult to calculate numerically, and in order to remove fast vertically propagating sound waves so that numerical integration could use a larger time step, he introduced the hydrostatic system, which assumes exact balance between the vertical pressure gradient and the gravitational force. This system leads to a columnar (function of altitude) equation for the vertical velocity called Richardson’s equation.
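In standard notation (pressure p, density ρ, gravitational acceleration g, height z), the exact balance the hydrostatic system imposes is

```latex
\frac{\partial p}{\partial z} = -\rho g .
```

The unapproximated vertical momentum equation retains a vertical acceleration term Dw/Dt on the left-hand side; Charney’s scale analysis shows that this term is small for the motions considered, which is what justifies the approximation for large-scale flow.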

A scale analysis of the equations of atmospheric motion assumes that the motion will retain those characteristics for the period of time indicated by the choice of the time scale (Browning and Kreiss, 1986). This means that the initial data must be smooth (vary on spatial scales on the order of 1000 km) so as to lead to time derivatives on the order of a day. To satisfy the latter constraint, the initial data must satisfy the elliptic constraints obtained by requiring that a number of time derivatives be of the order of a day. If all of these conditions are satisfied, then the solution can be ensured to evolve smoothly, i.e., on the spatial and time scales used in the scale analysis. This mathematical theory for hyperbolic systems is called “The Bounded Derivative Theory” (BDT) and was introduced by Professor Kreiss (Kreiss, 1979, 1980).

Instead of assuming exact hydrostatic balance (which leads to a number of mathematical problems discussed below), Browning and Kreiss (1986) introduced the idea of slowing down the vertically propagating waves rather than removing them completely, thus retaining the desirable mathematical property of hyperbolicity of the unmodified system. This modification was proved mathematically to accurately describe the large-scale motions of interest and, subsequently, also to describe smaller scales of motion in the mid-latitudes (Browning and Kreiss, 2002). In the latter manuscript, the correct elliptic constraints to ensure smoothly evolving solutions are derived. In particular, the elliptic equation for the vertical velocity is three dimensional, i.e., not columnar, and the horizontal divergence must be derived from the vertical velocity in order to ensure a smoothly evolving solution.

It is now possible to see why the hydrostatic system is not the correct reduced system (the system that correctly describes the smoothly evolving solution to a first degree of approximation). The columnar vertical velocity equation (Richardson’s equation) leads to columnar heating that is not spatially smooth. This is called rough forcing and leads to the physically unrealistic generation of large amounts of energy in the highest wave numbers of a model (Browning and Kreiss, 1994; Page, Fillion, and Zwack, 2007). This energy requires large amounts of nonphysical numerical dissipation to keep the model from becoming unstable, i.e., blowing up. We also mention that the boundary layer interacts very differently with a three-dimensional elliptic equation for the vertical velocity than with a columnar equation (Gravel, Browning, and Kreiss).

Browning, G. L., and H.-O. Kreiss, 1986: Scaling and computation of smooth atmospheric motions. Tellus, 38A, 295–313.
——, and ——, 1994: The impact of rough forcing on systems with multiple time scales. J. Atmos. Sci., 51, 369–383.
——, and ——, 2002: Multiscale bounded derivative initialization for an arbitrary domain. J. Atmos. Sci., 59, 1680–1696.
Charney, J. G., 1948: On the scale of atmospheric motions. Geofys. Publ., 17, 1–17.
Kreiss, H.-O., 1979: Problems with different time scales for ordinary differential equations. SIAM J. Numer. Anal., 16, 980–998.
——, 1980: Problems with different time scales for partial differential equations. Commun. Pure Appl. Math., 33, 399–440.
Gravel, S., G. L. Browning, and H.-O. Kreiss: The relative contributions of data sources and forcing components to the large-scale forecast accuracy of an operational model. This web site.
Page, C., L. Fillion, and P. Zwack, 2007: Diagnosing summertime mesoscale vertical motion: implications for atmospheric data assimilation. Mon. Wea. Rev., 135, 2076–2094.

Disappearing the MWP at Icefields, Alberta

In today’s post, I’m going to critically examine another widely used tree ring chronology: the Icefields (Alberta) MXD RCS chronology of Luckman and Wilson (2005 pdf), used most recently in Wilson et al 2016.

I’ll show that the RCS technique used in the LW2005 MXD chronology eliminated high medieval values as a tautology of their method, not as a property of the data, and that the Icefields data provide equal (or greater) justification for MXD RCS chronologies with elevated medieval values. The measure of the potential difference is previewed in Figure 1 below, which compares the reported LW2005 chronology (top panel) to an MXD chronology with elevated medieval values, calculated using an equally (or more) plausible RCS methodology; several other similar variations are shown in the post.


Figure 1.  Top – MXD RCS chronology from Luckman and Wilson 2005 Figure 2; bottom – MXD RCS chronology calculated under alternative assumptions – see Figure 3 below. 

I will use random effects statistical techniques both to analyse the data and to place the prior analysis in a more formal statistical context. Because LW2005 coauthor Rob Wilson stands alone for civility in the paleoclimate world, and because the present post is critical of past analysis, I would in some ways have preferred to use another example. On the other hand, it is also possible that Rob will recognize that the techniques applied in the present post – techniques that are unfamiliar to dendros – can yield fresh and important insight into the data, and will look at the Icefields data a little differently.

Although the article under consideration was published more than a decade ago, the analysis in today’s article was impossible until relatively recently, because coauthor Luckman withheld the relevant data for over a decade.


New Light on Gulf of Alaska

Last week, I posted on the effect of ex post site selection on the Gulf of Alaska tree ring chronology used in Wilson et al 2016 (from Wiles et al 2014). An earlier incarnation of this chronology (in D’Arrigo et al 2006) had had a severe divergence problem, a problem that Wiles et al purported to mitigate. However, their claimed mitigation depended on ex post selection of modern sites that were 800 km away from the original selection.

Even with this ex post selection, I could not replicate the supposed mitigation of the divergence problem, because Wiles et al had not archived all of their data: it appeared that their ex post result depended on still unarchived data. Subsequent to my post, Greg Wiles commendably archived the previously missing Wright Mountain data (but, unfortunately, neglected to archive the update to Eyak Mountain, leaving the data less incomplete but still incomplete).

The new data confirms my suspicion that the “missing” Wright Mountain data was inhomogeneous: it turns out that the average Wright Mountain RW was 21% higher than the average RW value from other modern sites.  Because Wiles et al used RCS with a single regional curve, the inclusion of this inhomogeneous data results in higher recent values.  By itself, the Wright Mountain data doesn’t actually go up in the last half of the 20th century, but inclusion of the inhomogeneous data translates the chronology upwards relative to subfossil data.   In the Calvinball world of RCS chronologies, the handling of inhomogeneous data is determined after the fact, with RCS chronologers seeming to be extremely alert to inhomogeneities that yield high medieval values (e.g. Polar Urals), but rather obtuse to inhomogeneities that yield high modern values, with decisions on site inhomogeneity always being made ex post. All too often, medieval-modern comparisons rest on Calvinball decisions, rather than the integrity of the data.
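The mechanism can be illustrated with a toy calculation (every number and curve below is invented; real RCS fits the regional curve from the data and handles pith offsets, detrending and averaging more carefully). With a single regional curve, the chronology at each year is in effect the mean ratio of measured ring widths to the age-expected width, so a modern site whose widths run uniformly higher — with no trend of its own — lifts the modern portion of the chronology relative to the subfossil portion:

```python
import numpy as np

# Toy RCS: expected ring width depends only on tree age (one regional curve).
def regional_curve(age):
    return 2.0 * np.exp(-age / 80.0) + 0.5   # invented age-growth curve

def rcs_chronology(ages, widths, years, year_grid):
    """Chronology = mean of width / age-expected width over trees in each year."""
    idx = widths / regional_curve(ages)      # standardised ring-width indices
    return np.array([idx[years == y].mean() for y in year_grid])

rng = np.random.default_rng(1)
year_grid = np.arange(1900, 2001)

def make_site(n_trees, scale):
    """Trees with trendless growth; 'scale' shifts the site's mean ring width."""
    ages, widths, years = [], [], []
    for _ in range(n_trees):
        birth = rng.integers(1850, 1900)
        a = np.arange(0, 2001 - birth)
        ages.append(a)
        years.append(birth + a)
        widths.append(scale * regional_curve(a) * rng.lognormal(0.0, 0.1, a.size))
    return (np.concatenate(ages), np.concatenate(widths), np.concatenate(years))

base = make_site(40, 1.0)    # homogeneous modern sites
inhom = make_site(10, 1.21)  # one site with 21% higher mean ring width, no trend
merged = tuple(np.concatenate([b, i]) for b, i in zip(base, inhom))

chron_base = rcs_chronology(*base, year_grid)
chron_all = rcs_chronology(*merged, year_grid)
print(f"uplift from inhomogeneous site: {chron_all.mean() / chron_base.mean() - 1:.2%}")
```

With 10 of 50 trees coming from the offset site, the whole modern chronology is translated upward by roughly a fifth of 21% even though no tree at any site "goes up".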

In today’s post, I’ll use the random effects techniques (consistently recommended at CA) to try to provide a structured consideration of this example.

Marvel et al. – Gavin Schmidt admits key error but disputes everything else

A guest article by Nicholas Lewis


Gavin Schmidt has finally provided, at the GISS website, the iRF and ERF forcing values for a doubling of CO2 (F2xCO2) in GISS-E2-R, and related to this has made wholesale corrections to the results of Marvel et al. 2015 (MEA15). He has coupled this with a criticism at RealClimate of my appraisal of MEA15, writing of it: “As is usual when people try too hard to delegitimise an approach or a paper, the criticisms tend to be a rag-bag of conceptual points, trivialities and, often, confused mis-readings – so it proves in this case”. Personally, I think this fits better as a description of Gavin Schmidt’s article. It contains multiple mistakes and misconceptions, which I think it worth setting the record straight on.

A Return to Polar Urals: Wilson et al 2016

Wilson et al 2016, like D’Arrigo et al 2006, includes a ‘Polar Urals’ chronology as one of its components.  Tree ring chronologies from Polar Urals and Yamal have long been a contentious issue at Climate Audit, dating back to the earliest days (see tags Yamal, Polar Urals).

Whereas the D’Arrigo et al 2006 version had one of the biggest blades in the entire D’Arrigo et al 2006 portfolio (ending at nearly 4 sigma, with the smooth at 2.5 sigma), the new version, while slightly elevated in the 20th century, remains under 1 sigma.


Figure 1. ‘Polar Urals’ chronologies (scaled to SD Units): top – from D’Arrigo et al 2006 (RCS); bottom – Wilson et al 2016. The top series is actually the Yamal chronology (Briffa, 2000), while the bottom series is a Polar Urals MXD version from Briffa et al (2013).

In today’s post, I’ll first discuss the dramatic reduction in the blade of the ‘Polar Urals’ chronology from D’Arrigo et al 2006 to Wilson et al 2016 (which uses a variation of the Polar Urals MXD chronology from Briffa et al 2013).  This doesn’t take long to explain.

I’ll then use the rest of the post to discuss the bewildering thicket of adjustments to the Polar Urals MXD chronology in Briffa et al 2013. I would be astonished if any practising paleoclimatologist (including the lead author of Wilson et al 2016) has ever made any attempt to figure out what Briffa (or, more likely, Melvin) actually did in Briffa et al 2013. I doubt that the editor and peer reviewers did either.

Picking Cherries in the Gulf of Alaska

The bias arising from ex post selection of sites for regional tree ring chronologies has been a long-standing issue at Climate Audit, especially in connection with Briffa’s chronologies for Yamal and Polar Urals (see tag). I discussed it most recently in connection with the Central Northwest Territories (CNWT) regional chronology of D’Arrigo et al 2006, in which I showed a remarkable example of ex post selection.

In today’s post, I’ll show a third vivid example of the impact of ex post site selection on the divergence problem, this time in the Gulf of Alaska regional chronologies. I did not pick this chronology as a particularly lurid example after examining multiple sites: it is the first column in the Wilson et al 2016 N-TREND spreadsheet and was the first site in that collection that I examined closely. It is also a site for which most (but not all) of the relevant data has been archived and which can therefore be examined. Unfortunately, data for many of the Wilson et al 2016 sites has not been archived and, if past experience is any guide, it might take another decade to become available (by which time we will have all “moved on”).

Cherry-Picking by D’Arrigo

One of the longest standing Climate Audit issues with paleoclimate reconstructions is ex post decisions on inclusion/exclusion of data, of which ex post decisions on inclusion/exclusion of sites/data in “regional [treering] chronologies” is one important family.  This was the issue in the original Yamal controversy, in response to which Briffa stated that they “would never select or manipulate data in order to arrive at some preconceived or unrepresentative result”. However, Briffa and associates have never set out ex ante criteria for site inclusion/exclusion, resulting in the methodology for Briffa regional reconstructions seeming more like Calvinball than science, as discussed in many CA posts.

Unlike Briffa, D’Arrigo has candidly admitted to the selection of data to arrive at a preconceived result. At the 2006 NAS panel workshop, Rosanne D’Arrigo famously told the surprised panelists that you have to pick cherries if you want to make cherry pie. Again in 2009 (though not noticed at the time), D’Arrigo et al 2009 stated that they could “partially circumvent” the divergence problem by only using data that went up:

The divergence problem can be partially circumvented by utilizing tree-ring data for dendroclimatic reconstructions from sites where divergence is either absent or minimal. (Wilson et al., 2007; Buntgen et al., in press; Youngblut and Luckman, in press).

Portfolio managers would like to have a similar option in constructing portfolios: if, after the fact, you could pick stocks that went up, it would be trivially easy to “circumvent” market downturns. That paleoclimatologists seem so obtuse to this simple observation is a major puzzlement.

In today’s post, I’ll show an absolutely breathtaking example of biased ex post picking by D’Arrigo et al in the D’Arrigo et al 2006 CNWT chronology. It was impossible for anyone to identify the full measure of this bias at the time, or for many years afterwards, as D’Arrigo and coauthors failed to archive the data and refused to provide it when requested. They were supported in their refusal by IPCC WG1 Co-Chair Susan Solomon, who, as CA readers are aware, threatened me with expulsion as an IPCC AR4 reviewer for seeking supporting data for D’Arrigo et al 2006 (then cited in preprint by AR4). The data showing the cherry picking only became available in 2014, as part of a belated archiving program in the final year of Gordon Jacoby’s life.


Marvel et al.: Implications of forcing efficacies for climate sensitivity estimates – update

A guest article by Nicholas Lewis


In a recent article I discussed the December 2015 Marvel et al.[1] paper, which contends that estimates of the transient climate response (TCR) and equilibrium climate sensitivity (ECS) derived from recent observations of changes in global mean surface temperature (GMST) are biased low. Marvel et al. reached this conclusion from analysing the response of the GISS-E2-R climate model in simulations over the historical period (1850–2005) when driven by six individual forcings, and also by all forcings together, the latter referred to as the ‘Historical’ simulation. The six individual forcings analysed were well-mixed greenhouse gases (GHG), anthropogenic aerosols, ozone, land use change, solar variations and volcanoes. Ensembles of five simulation runs were carried out for each constituent individual forcing, and of six runs for all forcings together. Marvel et al.’s estimates were based on averaging over the relevant simulation runs; taking ensemble averages reduces the impact of random variability.
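For reference, the observational estimates that Marvel et al. argue are biased low are typically energy-budget estimates of the standard form (here ΔT, ΔF and ΔN denote the changes in GMST, total forcing and top-of-atmosphere radiative imbalance between a base period and a final period, and F₂ₓ the forcing from a doubling of CO2):

```latex
\mathrm{TCR} = F_{2\times}\,\frac{\Delta T}{\Delta F},
\qquad
\mathrm{ECS} = F_{2\times}\,\frac{\Delta T}{\Delta F - \Delta N}.
```

Forcing "efficacies" matter to these estimators because, if some constituents of ΔF produce more or less warming per unit forcing than CO2 does, scaling ΔT by F₂ₓ/ΔF no longer yields an unbiased sensitivity estimate.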

In this article I will give an update on the status of two points I tentatively raised in my original article.