Varved Inconsistency

Since AR4, a series of new multiproxy studies has appeared, several of which were cited in AR5 (Mann et al 2008; Ljungqvist et al 2010; Christiansen and Ljungqvist 2012; Shi et al 2013). A distinctive feature of these and other recent multiproxy studies is the incorporation of varve thickness and near-equivalent mass accumulation rate (MAR) series, in which varve thickness (positively oriented) is interpreted as a direct proxy for temperature. The following table shows the usage of such series in post-AR4 multiproxy studies (“long” series shown below). It is evident that the varve thickness data in multiproxy studies is anything but “independent”.

Table 1. Varve thickness and MAR (mass accumulation rate) series used in multiproxy studies which are both “long” (including the medieval period) and which have not been truncated in the modern period. Both logged and unlogged versions are used. In a couple of cases, the mass accumulation rate is limited to organics (“dark”). I’ve also included the Igaliku pollen accumulation rate series, because it appears to me to be closely related to MAR series. XRD (X-ray density) series are not included.

One of the most obvious features of the above table is the repeated use of a small number of varve thickness series introduced in Kaufman et al 2009: Big Round, Blue, C2, Donard, Iceberg and Lower Murray Lakes. Five of the six series were used in Shi et al 2013. In my recent discussion of Shi et al 2013, I observed that a composite of the five series (and the same is true for all six) had something of an HS-shape, though the series otherwise had negligible common “signal” (as demonstrated clearly by their eigenvalues). Further, several of the series (especially Iceberg, which had been discussed in prior CA posts) had serious problems, compromising or potentially compromising any utility as a temperature proxy. This certainly suggested to me that the mild HS-shape of the varve thickness composite was more likely to be an artifact of selection from a noisy network than actual scientific knowledge. Skeptic blogs have long discussed this phenomenon, but it is one to which academic literature in the field has been wilfully obtuse.

Blog discussion has been mostly based on red noise examples. So I think that readers may be interested in seeing the phenomenon at work with actual data.

In the course of examining the literature on varves, it quickly became evident that specialist literature prior to the relatively recent multiproxy articles had regarded thick varves as evidence of glacier advance (rather than “warmth”). Readers (and I) wondered how the prior consensus (so to speak) that thick varves were related to glacier advance (and vice versa) had been replaced by a model in which thick varves were interpreted as evidence of warmer temperatures. This proved to be an interesting backstory. I’ll also contrast the varve thickness series from Iceberg Lake, a canonical series in Kaufman et al 2009 and subsequent multiproxy studies, with “non-canonical” varve thickness series from Silvaplana, Switzerland and Hvitarvatn, Iceland, where thin varves are interpreted as evidence of warmth.

Does the observational evidence in AR5 support its/the CMIP5 models’ TCR ranges?

A guest post by Nic Lewis

Steve McIntyre pointed out some time ago, here, that almost all the global climate models around which much of the IPCC’s AR5 WGI report was centred had been warming faster than the real climate system over the last 35-odd years, in terms of the key metric of global mean surface temperature. The relevant figure from Steve’s post is reproduced as Figure 1 below.

Figure 1 Modelled versus observed decadal global surface temperature trend 1979–2013

Temperature trends in °C/decade. Models with multiple runs have separate boxplots; models with single runs are grouped together in the boxplot marked ‘singleton’. The orange boxplot at the right combines all model runs together. The default settings in the R boxplot function have been used. The red dotted line shows the actual increase in global surface temperature over the same period per the HadCRUT4 observational dataset.


Transient climate response

Virtually all the projections of future climate change in AR5 are based on the mean and range of outcomes simulated by this latest CMIP5 generation of climate models (AOGCMs). Changes in other variables largely scale with changes in global surface temperature. The key determinant of the range and mean level of projected increases in global temperature over the rest of this century is the transient climate response (TCR) exhibited by each CMIP5 model, and their mean TCR. Model equilibrium climate sensitivity (ECS) values, although important for other purposes, provide little information regarding surface warming out to the last quarter of this century beyond that given by TCR values.

TCR represents the increase in 20-year mean global temperature over a 70-year timeframe during which CO2 concentrations, rising throughout at 1% p.a. compound, double. More generally, paraphrasing from Section 10.8.1 of AR5 WG1, TCR can be thought of as a generic property of the climate system that determines the global temperature response ΔT to any gradual increase in effective radiative forcing (ERF – see AR5 WGI glossary, here) ΔF taking place over a ~70-year timescale, normalised by the ratio of the forcing change to the forcing due to doubling CO2, F2xCO2: TCR = F2xCO2 ΔT/ΔF. This equation permits warming resulting from a gradual change in ERF over a 60–80 year timescale, at least, to be estimated from the change in ERF and TCR. Equally, it permits TCR to be estimated from such changes in global temperature and in ERF.

The TCRs of the 30 AR5 CMIP5 models featured in WGI Table 9.5 vary from 1.1°C to 2.6°C, with a mean of slightly over 1.8°C. Many projections in AR5 are for changes up to 2081–2100. Applying the CMIP5 TCRs to the changes in CO2 concentration and other drivers of climate change from the first part of this century up to 2081–2100, expressed as the increase in total ERF, explains most of the projected rises in global temperature on the business-as-usual RCP8.5 scenario, although the relationship varies from model to model. Overall the models project about 10–20% faster warming than would be expected from their TCR values, allowing for warming ‘in-the-pipeline’. That discrepancy, which will not be investigated in this article, implies that the mean ‘effective’ TCR of the AR5 CMIP5 models for warming towards the end of this century under RCP8.5 is in the region of 2.0–2.2°C.


Observational evidence in AR5 about TCR

AR5 gives a ‘likely’ (17–83% probability) range for TCR of 1.0–2.5°C, pretty much in line with the 5–95% CMIP5 model TCR range (from fitting a Normal distribution) but with a downgraded certainty level. How does that compare with the observational evidence in AR5? Figure 10.20a thereof, reproduced as Figure 2 here, shows various observationally based TCR estimates.

Figure 2. Reproduction of Figure 10.20a from AR5
Bars show 5–95% uncertainty ranges for TCR.[i]
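As an aside on the mechanics of that comparison, the “fitting a Normal distribution” step mentioned above can be sketched as follows; the TCR values below are hypothetical placeholders, not the actual Table 9.5 values:

```python
# Sketch of deriving a 5-95% range by fitting a Normal distribution to a set
# of model TCR values (hypothetical placeholders, NOT the actual CMIP5 TCRs).
from statistics import mean, stdev

model_tcrs = [1.1, 1.3, 1.5, 1.6, 1.8, 1.8, 1.9, 2.0, 2.2, 2.6]  # hypothetical

mu, sigma = mean(model_tcrs), stdev(model_tcrs)
z95 = 1.645  # standard-normal 95th percentile
lo, hi = mu - z95 * sigma, mu + z95 * sigma
print(f"fitted 5-95% TCR range: {lo:.2f}-{hi:.2f} degC")
```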

On the face of it, the observational study TCR estimates in Figure 2 offer reasonable support to the AR5 1.0–2.5°C range, leaving aside the Tung et al. (2008) study, which uses a method that AR5 WGI discounts as unreliable. However, I have undertaken a critical analysis of all these TCR studies, here. I find serious fault with all the studies other than Gillett et al. (2013), Otto et al. (2013) and Schwartz (2012). Examples of the faults that I find with other studies are:

Harris et al. (2013): This perturbed physics/parameter ensemble (PPE) study’s TCR range, like its ECS range, almost entirely reflects the characteristics of the UK Met Office HadCM3 model. Despite the HadCM3 PPE (as extended by emulation) sampling a wide range of values for 31 key model atmospheric parameters, the model’s structural rigidities are so strong that none of the cases results in the combination of low-to-moderate climate sensitivity and low-to-moderate aerosol forcing that the observational data best supports – nor could perturbing aerosol model parameters achieve this.

Knutti and Tomassini (2008): This study used initial estimates of aerosol forcing totalling −1.3 W/m² in 2000, in line with AR4 but far higher than the best estimate in AR5. Although it attempted to observationally-constrain these initial estimates, the study’s use of only global temperature data makes it impossible to separate properly greenhouse gas and aerosol forcing, the evolution of which are very highly (negatively) correlated at a global scale. The resulting final estimates of aerosol forcing are still significantly stronger than the AR5 estimates, biasing up TCR estimation. The use of inappropriate uniform and expert priors for ECS in the Bayesian statistical analysis further biases TCR estimation.

Rogelj et al. (2012): This study does not actually provide an observationally-based estimate for TCR. It explicitly sets out to generate a PDF for ECS that simply reflects the AR4 ‘likely’ range and best estimate; in fact it reflects a slightly higher range. Moreover, the paper and its Supplementary Information do not even mention estimation of TCR or provide any estimated PDF for TCR.

Stott and Forest (2007): This TCR estimate is based on the analysis in Stott et al. (2006), an AR4 study from which all four of the unlabelled grey dashed-line PDFs in Figure 10.20a are sourced. It used a detection-and-attribution regression method applied to 20th century temperature observations to scale TCR values, and 20th century warming attributable to greenhouse gases, for three AOGCMs. Gillett et al. (2012) found that just using 20th century data for this purpose biased TCR estimation up by almost 40% compared with when 1851–2010 data was used. Moreover, the 20th century greenhouse gas forcing increase used in Stott and Forest (2007) to derive TCR (from the Stott et al. (2006) attributable warming estimate) is 11% below that per AR5, biasing up its TCR estimation by a further 12%.

In relation to the three studies that I do not find any serious fault with, some relevant details from my analysis are:

Gillett et al. (2013): This study uses temperature observations over 1851–2010 and a detection-and-attribution regression method to scale AOGCM TCR values. The individual CMIP5 model regression-based observationally-constrained TCRs shown in a figure in the Gillett et al. (2013) study imply a best (median[ii]) estimate for TCR of 1.4°C, with a 5–95% range of 0.8–2.0°C.[iii] That compares with a range of 0.9–2.3°C given in the study based on a single regression incorporating all models at once, though it is unclear whether that is as suitable a method.

Otto et al. (2013): There are two TCR estimates from this energy budget study included in Figure 10.20a. One estimate uses 2000–2009 data and has a median of 1.3°C, with a 5–95% range of 0.9–2.0°C. The other estimate uses 1970–2009 data and has a median of slightly over 1.35°C, with a 5–95% range of 0.7–2.5°C. Since mean forcing was substantially higher over 2000–2009 than over 1970–2009, and was also less affected by volcanic activity, the TCR estimate based on 2000–2009 data is less uncertain, and arguably more reliable, than that based on 1970–2009 data.

Schwartz (2012): This study derived TCR by zero-intercept regressions of changes, from the 1896–1901 mean, in observed global surface temperature on corresponding changes in forcing, up to 2009, based on forcing histories used in historical model simulations. The mean change in forcing up to 1990 (pre the Mount Pinatubo eruption) per the five datasets used to derive the TCR range is close to the best estimate of the forcing change per AR5. The study’s TCR range is 0.85–1.9°C, with a median estimate of 1.3°C.

So the three unimpeached studies in Figure 10.20a support a median TCR estimate of about 1.35°C, and a top of the ‘likely’ range for TCR of about 2.0°C based on downgrading 5–95% ranges, following AR5.


The implication for TCR of the substantial revision in AR5 to aerosol forcing estimates

There has been a 43% increase in the best estimate of total anthropogenic radiative forcing between that for 2005 per AR4, and that for 2011 per AR5. Yet global surface temperatures remain almost unchanged: 2012 was marginally cooler than 2007, whilst the trailing decadal mean temperature was marginally higher. The same 0.8°C warming now has to be spread over a 43% greater change in total forcing, natural forcing being small in 2005 and little different in 2012. The warming per unit of forcing is a measure of climate sensitivity, in this case a measure close to TCR, since most of the increase in forcing has occurred over the last 60–70 years. It follows that TCR estimates that reflect the best estimates of forcing in AR5 should be of the order of 30% lower than those that reflected AR4 forcing estimates.
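The ~30% figure follows directly from the arithmetic: warming per unit forcing scales as ΔT/ΔF, so a 43% increase in ΔF with ΔT unchanged lowers the estimate by a factor of 1/1.43. A one-line check:

```python
# Same ~0.8 degC warming spread over 43% more forcing: sensitivity scales as
# deltaT / deltaF, so the AR5-based estimate is a factor 1/1.43 of the AR4-based one.
forcing_ratio = 1.43  # AR5 total anthropogenic forcing / AR4-era estimate

reduction = 1 - 1 / forcing_ratio
print(f"TCR estimates reflecting AR5 forcings: ~{reduction:.0%} lower")  # ~30% lower
```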

Two thirds of the 43% increase in estimated total anthropogenic forcing between AR4 and AR5 is accounted for by revisions to the 2005 estimate, reflecting improved understanding, with the increase in greenhouse gas concentrations between 2005 and 2011 accounting for almost all of the remainder. Almost all of the revision to the 2005 estimate relates to aerosol forcing. The AR5 best (median) estimate of recent total aerosol forcing is −0.9 W/m², a large reduction from −1.3 W/m² (for a more limited measure of aerosol forcing) in AR4. This reduction has major implications for TCR and ECS estimates.

Moreover, the best estimate the IPCC gives in AR5 for total aerosol forcing is not fully based on observations. It is an expert judgement based on a composite of estimates derived from simulations by global climate models and from satellite observations. The nine satellite-observation-derived aerosol forcing estimates featured in Figure 7.19 of AR5 WGI range from −0.09 W/m² to −0.95 W/m², with a mean of −0.65 W/m². Of these, six satellite studies with a mean best estimate of −0.78 W/m² were taken into account in deciding on the −0.9 W/m² AR5 composite best estimate of total aerosol forcing.


TCR calculation based on AR5 forcing estimates

Arguably the most important question is: what do the new ERF estimates in AR5 imply about TCR? Over the last century or more we have had a period of gradually increasing ERF, with some 80% of the decadal mean increase occurring fairly smoothly, volcanic eruptions apart, over the last ~70 years. We can therefore use the TCR = F2xCO2 ΔT/ΔF equation to estimate TCR from ΔT and ΔF, taking the change in each between the means for two periods, each long enough for internal variability to be small.

That is exactly the method used, with a base period of 1860–1879, by the ‘energy budget’ study Otto et al. (2013), of which I was a co-author. That study used estimates of radiative forcing that are approximately consistent with estimates from Chapters 7 and 8 of AR5, but since AR5 had not at that time been published the forcings were actually diagnosed from CMIP5 models, with an adjustment being made to reflect satellite-observation-derived estimates of aerosol forcing. However, in a blog-published study, here, I did use the same method but with forcing estimates (satellite-based for aerosols) taken from the second draft of AR5. That study estimated only ECS, based on changes between 1871–1880 and 2002–2011, but a TCR estimate of 1.30°C is readily derived from information in it.

We can now use the robust method of the Otto et al. (2013) paper in conjunction with the published AR5 forcing best (median) estimates up to 2011, the most recent year given. The best periods to compare appear to be 1859–1882 and 1995–2011. These two periods are the longest ones in respectively the earliest and latest parts of the instrumental period that were largely unaffected by major volcanic eruptions. Volcanic forcing appears to have substantially less effect on global temperature than other forcings, and so can distort TCR estimation. Using a final period that ends as recently as possible is important for obtaining a well-constrained TCR estimate, since total forcing (and the signal-to-noise ratio) declines as one goes back in time. Measuring the change from early in the instrumental period maximises the ratio of temperature change to internal variability, and since non-volcanic forcings were small then it matters little that they are known less accurately than in recent decades. Moreover, these two periods are both near the peak of the quasi-periodic ~65 year AMO cycle. Using a base period extending before 1880 limits one to using the HadCRUT surface temperature dataset. However, that is of little consequence since the HadCRUT4 v2 change in global temperature from 1880–1900 to 1995–2011 is identical to that per NCDC MLOST and only marginally below that per GISS.

In order to obtain a TCR estimate that is as independent of global climate models as possible, one should scale the aerosol component of the AR5 total forcing estimates to match the AR5 recent satellite-observation-derived mean of −0.78 W/m². Putting this all together gives ΔF = 2.03 W/m² and ΔT = 0.71°C, which, since AR5 uses F2xCO2 = 3.71 W/m², gives a best estimate of 1.30°C for TCR. The best estimate for TCR would be 1.36°C without scaling aerosol forcing to match the satellite-observation-derived mean.
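The calculation is simple enough to verify directly from the numbers quoted above:

```python
# Energy-budget TCR estimate using the figures quoted in the text:
# TCR = F_2xCO2 * deltaT / deltaF.
F_2xCO2 = 3.71  # W/m^2, forcing from a doubling of CO2 (AR5)
delta_F = 2.03  # W/m^2, change in total forcing, 1859-1882 to 1995-2011
delta_T = 0.71  # degC, corresponding change in global temperature (HadCRUT4)

tcr = F_2xCO2 * delta_T / delta_F
print(f"TCR best estimate: {tcr:.2f} degC")  # 1.30 degC
```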

So, based on the most up to date numbers from the IPCC AR5 report itself and using the most robust methodology on the data with the best signal-to-noise ratio, one arrives at an observationally based best estimate for TCR of 1.30°C, or 1.36°C based on the unadjusted AR5 aerosol forcing estimate.

I selected 1859–1882 and 1995–2011 as they seem to me to be the best periods for estimating TCR. But it is worth looking at longer periods as well, even though the signal-to-noise ratio is lower. Using 1850–1900 and 1985–2011, two periods with mean volcanic forcing levels that, although significant, are well matched, gives a TCR best estimate of 1.24°C, or 1.30°C based on the unadjusted AR5 aerosol forcing estimate. The TCR estimates are even lower using 1850–1900 to 1972–2011, periods that are also well-matched volcanically.

What about estimating TCR over a shorter timescale? If one took ~65 rather than ~130 years between the middles of the base and end periods, and compared 1923–1946 with 1995–2011, the TCR estimates would be almost unchanged. But there is some sensitivity to the exact periods used. An alternative approach is to use information in the AR5 Summary for Policymakers (SPM) about anthropogenic-only changes over 1951–2010, a well-observed period. The mid-range estimated contributions to global mean surface temperature change over 1951–2010 per Section D.3 of the SPM are 0.9°C for greenhouse gases and −0.25°C for other anthropogenic forcings, total 0.65°C. The estimated change in total anthropogenic radiative forcing between 1950 and 2011 of 1.72 W/m² per Figure SPM.5, reduced by 0.04 W/m² to adjust to 1951–2010, implies a TCR of 1.4°C after multiplying by an F2xCO2 of 3.71 W/m². When instead basing the estimate on the linear trend increase in observed total warming of 0.64°C over 1951–2010 per Jones et al. (2013) – the study cited in the section to which the SPM refers – (the estimated contribution from internal variability being zero) and the linear trend increase in total forcing per AR5 of 1.73 W/m², the implied TCR is also 1.4°C. Scaling the AR5 aerosol forcing estimates to match the mean satellite-observation-derived aerosol forcing estimate would reduce the mean of these two TCR estimates to 1.3°C.
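Both SPM-based routes reduce to the same TCR = F2xCO2 ΔT/ΔF relation; a quick check of the arithmetic:

```python
# The two SPM-based TCR calculations described above.
F_2xCO2 = 3.71  # W/m^2

# (a) SPM attributable-warming route: 0.9 degC (GHG) - 0.25 degC (other anthro),
#     forcing change 1.72 W/m^2 less a 0.04 W/m^2 adjustment to 1951-2010
tcr_spm = F_2xCO2 * (0.9 - 0.25) / (1.72 - 0.04)

# (b) Jones et al. (2013) linear-trend route: 0.64 degC over 1.73 W/m^2
tcr_trend = F_2xCO2 * 0.64 / 1.73

print(f"SPM route:   {tcr_spm:.1f} degC")    # 1.4
print(f"trend route: {tcr_trend:.1f} degC")  # 1.4
```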


So does the observational evidence in AR5 support its/the CMIP5 models’ TCR ranges?

The evidence from AR5 best estimates of forcing, combined with that in solid observational studies cited in AR5, points to a best (median) estimate for TCR of 1.3°C if the AR5 aerosol forcing best estimate is scaled to match the satellite-observation-derived best estimate thereof, or 1.4°C if not (giving a somewhat less observationally-based TCR estimate). We can compare this with model TCRs. The distribution of CMIP5 model TCRs is shown in Figure 3 below, with a maximally observationally-based TCR estimate of 1.3°C for comparison.

Fig.3 TCR post CMIP5 TCRs Ross

Figure 3. Transient climate response distribution for CMIP5 models in AR5 Table 9.5
The bar heights show how many models in Table 9.5 exhibit each level of TCR

Figure 3 shows an evident mismatch between the observational best estimate and the model range. Nevertheless, AR5 states (Box 12.2) that:

“the ranges of TCR estimated from the observed warming and from AOGCMs agree well, increasing our confidence in the assessment of uncertainties in projections over the 21st century.”

How can this be right, when the median model TCR is 40% higher than an observationally-based best estimate of 1.3°C, and almost half the models have TCRs 50% or more above that? Moreover, the fact that effective model TCRs for warming to 2081–2100 are 10–20% higher than their nominal TCRs means that over half the models project future warming on the RCP8.5 scenario that is over 50% higher than what an observational TCR estimate of 1.3°C implies.

Interestingly, the final draft of AR5 WG1 dropped the statement in the second draft that TCR had a most likely value near 1.8°C, in line with CMIP5 models, and marginally reduced the ‘likely’ range from 1.2–2.6°C to 1.0–2.5°C, at the same time as making the above claim.

So, in their capacity as authors of Otto et al. (2013), we have fourteen lead or coordinating lead authors of the WG1 chapters relevant to climate sensitivity stating that the most reliable data and methodology give ‘likely’ and 5–95% ranges for TCR of 1.1–1.7°C and 0.9–2.0°C, respectively. They go on to suggest that some CMIP5 models have TCRs that are too high to be consistent with recent observations. On the other hand, we have Chapter 12, Box 12.2, stating that the ranges of TCR estimated from the observed warming and from AOGCMs agree well. Were the Chapter 10 and 12 authors misled by the flawed TCR estimates included in Figure 10.20a? Or, given the key role of the CMIP5 models in AR5, did the IPCC process offer the authors little choice but to endorse the CMIP5 models’ range of TCR values?


[i] Note that the PDFs and ranges given for Otto et al. (2013) are slightly too high in the current version of Figure 10.20a. It is understood that those in the final version of AR5 will agree with the ranges in the published study.

[ii] All best estimates given are medians (50% probability points for continuous distributions), unless otherwise stated.

[iii] This range for Gillett et al. (2013) excludes an outlier at either end; doing so does not affect the median.

The “Canonical” Varve Thickness Series

Shi et al 2013 use the following five varve thickness series, all of which have become widely used in multiproxy series since their introduction in Kaufman et al 2009: Big Round Lake and Donard Lake, Baffin Island; Lower Murray Lake, Ellesmere Island; and Blue Lake and Iceberg Lake, Alaska. Some of these proxies have been discussed from time to time, with an especially detailed discussion of Iceberg Lake (see tag here.)

The figure below compares a simple scaled average of these five series to the Hvitarvatn varve thickness series (inverted so that the Little Ice Age is shown as “cold” rather than warm). See the accompanying discussion of Hvitarvatn here. Whereas Miller et al reported that the 19th century at Hvitarvatn was the period of greatest glacier advance in the entire Holocene, the “Kaufman five” show 19th century levels similar to the 11th century medieval period, with an anomalously “warm” 20th century:

Figure 1. Top – average of five Shi et al varve thickness series; bottom – Hvitarvatn varve thickness (inverted). All in SD Units.

There is no “common signal” in the Kaufman Five according to common methods. The median inter-series correlation is 0.00605, with negative inter-series correlations as common as positive ones. If one examines the eigenvalues of the correlation matrix – a useful precaution in assessing whether the data contains a “common signal” – there are no eigenvalues that are separable from red noise, as is evident in the barplot below.

Figure 2. Eigenvalues of (Kaufman Five) Varve Thickness Series
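For readers who want to reproduce the style of this check on their own data, here is a minimal sketch: compute the correlation matrix of N series, take its eigenvalues, and compare the leading eigenvalue against a Monte Carlo null of independent AR(1) (“red noise”) series. The data below are simulated, not the actual Kaufman series:

```python
# Eigenvalue screening of a proxy network for a "common signal", using
# simulated data (NOT the actual Kaufman Five varve series).
import numpy as np

rng = np.random.default_rng(0)

def ar1(n, phi, rng):
    """Generate an AR(1) ("red noise") series of length n with lag-1 coefficient phi."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

n_years, n_series, phi = 1000, 5, 0.3

# Eigenvalues of the correlation matrix of five simulated "proxy" series
series = np.column_stack([ar1(n_years, phi, rng) for _ in range(n_series)])
eigs = np.linalg.eigvalsh(np.corrcoef(series, rowvar=False))[::-1]  # descending

# Monte Carlo null: leading eigenvalue when the series are truly independent
null_lead = [
    np.linalg.eigvalsh(np.corrcoef(
        np.column_stack([ar1(n_years, phi, rng) for _ in range(n_series)]),
        rowvar=False))[-1]
    for _ in range(200)
]
print("leading eigenvalue:", round(float(eigs[0]), 3))
print("null 95th percentile:", round(float(np.quantile(null_lead, 0.95)), 3))
```

A network with a genuine common signal would show a leading eigenvalue well above the null's 95th percentile; a leading eigenvalue inside the null range is indistinguishable from red noise.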

Despite the overwhelming lack of common signal according to these criteria, the average of the Kaufman Five has a distinctly elevated 20th century. Here is a plot of the Kaufman Five. The lack of correlation and of significant eigenvalues is evident in the plot: there is little in common among the series except for one feature, namely that the 20th century in each series is somewhat elevated relative to the 19th century. (As noted above, the average of the five series has a somewhat elevated 20th century, but is relatively featureless in centuries prior to the 20th, especially in comparison to the well-dated Hvitarvatn series.)

Figure 3. Five Varve thickness series used in Shi et al 2013 (SD Units.)

When parsed in detail, each of the Kaufman Five has troubling defects, some of which I’ll briefly discuss today and which I’ll try to follow up on.

The Iceberg Lake, Alaska series has profound inhomogeneities, especially in its 20th century portion. A major inhomogeneity is that varve thickness is related to distance to the inlet, an observation first made at Climate Audit in comments on Loso 2006. Loso 2009 conceded this point (without mentioning Climate Audit, though it did acknowledge Willis Eschenbach, who corresponded with Loso on a different point), but its remedy (taking logarithms) was hopelessly inadequate to the problem. Dietrich and Loso 2012 acknowledges that inhomogeneities impact their reconstruction, but did not amend or withdraw the earlier series. Interestingly, Dietrich and Loso report glacier advance in Alaska commencing around 1250AD, almost exactly contemporaneous with the well-dated Hvitarvatn advance. The Iceberg Lake series, as used, has a late 20th century uptick coinciding with a major inhomogeneity, the effect of which cannot be separated under any plausible technique known to me.

Major features of the Big Round Lake series (as I’ve observed previously) correspond to major features of the Hvitarvatn series, and the inter-series correlation between these two series is much higher than to other series in the Kaufman Five. The only problem is that this correlation requires inversion of the Big Round series, so that thicker varves are generated in the Little Ice Age. There are important geological parallels between the two sites: like Hvitarvatn, Big Round is a proglacial lake, the sediment volume of which is related to the proximity of a nearby glacier, which advanced in the Little Ice Age to its Holocene maximum and receded in the 20th century. In order to use the Big Round series in its present orientation, specialists have to explain why its behavior is opposite to Hvitarvatn’s, and why one should interpret Big Round varve data as showing a Little Warm Period in Baffin Island during Iceland’s Little Ice Age (especially when glacier lines moved 500 m lower in Baffin Island during this period). The reason why the Big Round varves are oriented thick-up by the original authors (Thomas et al 2009) is that there is a positive correlation in the late 20th century between varve thickness and local temperatures. Together with the exclusion of the inhomogeneous Iceberg Lake series, inverting this series (as seems required) would obviously impact the average of the canonical series.

Like Hvitarvatn and Big Round Lake, Donard Lake (Baffin Island) is a proglacial lake whose sediment volume is controlled by proximity of a nearby glacier (Caribou Glacier). Once again, this glacier reached its Holocene maximum in the Little Ice Age, prior to its 20th century retreat. However, the Donard Lake varve thickness series has a slightly negative correlation to the Big Round Lake series. Rather than simply averaging these two incompatible series, specialists need to closely re-examine the data to explain the inconsistency. Donard Lake dating is one thing that needs close examination.

Thomas and associates have recently reported a third proglacial varve thickness series from Baffin Island (Ayr Lake), for which they were unable to report a significant correlation to instrumental temperature. Thus, they did not report a temperature reconstruction for this site. However, the absence of such a correlation surely bleeds back to the other series, inviting a reconsideration of whether their supposed correlations to temperature were spurious – particularly in the context of their inconsistency with the well-dated Hvitarvatn series.

Because varve thickness in these proglacial lakes is profoundly affected by glacier proximity, there is no homogeneous relationship between varve thickness and temperature.

More on Hvitarvatn Varves

In a previous post on PAGES2K Arctic, I pointed out that they had used the Hvitarvatn, Iceland series (PAGES2K version shown below), upside-down to the interpretation of the original authors (Miller et al), who had interpreted thick varves as evidence of the Little Ice Age. A few days ago, Miller and coauthors archived a variety of series from Hvitarvatn, prompting me to review this data.

Figure 1. PAGES2K Arctic Hvitarvatn series. PAGES2K implicitly reconstructs a Little Warm Age during the period of glacier advance, the conventional Little Ice Age.

It turns out that Hvitarvatn is a very interesting lake sediment site for a variety of reasons. It is much more securely dated than the Baffin Island sites; it has been carefully described geologically; a variety of proxies have been calculated for the location; some proxies are available through the Holocene, not just the last 1000 years or so; and, there are a variety of other well-dated proxies near Iceland.

In geophysics, specialists work out from high-quality data to lower quality data, rather than throwing all of the data into a jumble (as paleoclimatologists do). In today’s post, I’m going to parse the Hvitarvatn data, which I will later use to parse the Baffin Island and Ellesmere Island varve data, all of which has become widely used since Kaufman et al 2009 (including PAGES2K, Ljungqvist and most recently Shi et al 2013).

Dating
Varve chronologies are determined by simply counting varves – which seems straightforward enough. In Iceland, a series of well-dated volcanic tephra permit a cross-check of varve counting. When reconciled to tephra, one of the varve chronologies was over 100 years short by the medieval period. The possibility of similar errors in varve chronologies in Baffin Island and elsewhere needs to be kept firmly in mind.
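The logic of the tephra cross-check can be illustrated with a toy calculation; the layer names, ages and varve count below are hypothetical:

```python
# Sketch of a tephra cross-check on a varve chronology: count varves between
# two tephra layers of known calendar age and compare with the elapsed
# calendar years. All values here are hypothetical illustrations.
tephra_ages = {"tephra_A": 1104, "tephra_B": 1766}  # hypothetical eruption years (AD)
varves_counted_between = 550                         # hypothetical varve count between layers

calendar_years = tephra_ages["tephra_B"] - tephra_ages["tephra_A"]
shortfall = calendar_years - varves_counted_between
print(f"varve chronology runs {shortfall} years short over {calendar_years} calendar years")
```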

Varve Thickness and Glacier Advance
A number of canonical varve sites are from proglacial lakes (e.g. Big Round, Donard and Ayr in Baffin Island; Iceberg Lake in Alaska). Like these other sites, Hvitarvatn is a proglacial lake in the watershed of the Langjökull ice cap. Miller et al observe that varve thickness is controlled by the glacier:

Varve thickness is controlled by the rate of glacial erosion and efficiency of subglacial discharge from the adjacent Langjökull ice cap.

They observe that the Langjökull ice cap had receded during the Holocene optimum and had only advanced to the lake during the last millennium. They dated the start of the Little Ice Age to ~1250AD, with a first phase of glacier advance between 1250 and 1500AD and a second phase commencing ~1750AD and ending only around 1900AD. Miller et al report that, within the entire Holocene, ice-rafted debris occurred only during this second phase, especially during the 19th century.

…The largest perturbation began ca 1250 AD, signaling the onset of the Little Ice Age and the termination of three centuries of relative warmth during Medieval times. Consistent deposition of ice-rafted debris in Hvítárvatn is restricted to the last 250 years, demonstrating that Langjökull only advanced into Hvítárvatn during the coldest centuries of the Little Ice Age, beginning in the mid eighteenth century. This advance represents the glacial maximum for at least the last 3 ka, and likely since regional deglaciation 10 ka.

The two outlet glaciers terminating in Hvitarvatn, Norðurjökull and Suðurjökull, advance slowly into the lake, occupying their maximum lake area in the late 19th century, and retreat comparatively rapidly in the mid- to late 20th century.

we place the start of the LIA in the highlands in the mid-thirteenth century, when Langjökull began a series of two long periods of expansion, with high stands at ca 1500 AD and in the nineteenth century.

This study presents the first continuous record of ice cap extent for the entire Holocene and clearly demonstrates that the LIA contained the most extensive glacial advance of the Neoglacial interval. The strong multi-proxy signal at Hvítárvatn implies that the LIA was the coldest period of the last 8 ka and suggests that it is unlikely for any non-surging Iceland glacier to have reached dimensions significantly larger than its LIA maximum at any time during the Holocene.

The figure below shows varve thickness and ice-rafted debris data from Hvitarvatn.

Ice-rafted debris is interpreted as evidence that the glacier had advanced into the lake. It is completely absent from the record through the Holocene Optimum and rare until the second phase of the Little Ice Age, reaching maximum intensity in the 19th century (with local maxima in 1890 and 1940), before declining rapidly with the 20th-century recession of the glacier.

The varve record begins about 3000 BP. Varves thicken quite dramatically commencing about 1250AD, which Miller et al interpret as the start of the Little Ice Age. They interpret 1500-1760AD as a sort of standstill, with, as noted above, thick varves in the second phase (19th century) coinciding with ice-rafted debris. Varves remain thick during glacier recession in the 20th century, even into the warm 1930s.

hvitarvatn_varve-and-IRD_annotated
Figure 1. Hvitarvatn varve thickness and ice-rafted debris.

During the Holocene Thermal Maximum (HTM), the recession of the glacier meant that sediments were not varved. Miller et al utilized biogenic silica and information on diatoms to reconstruct HTM warmth, as shown below.
hvitarvatn biogenic

The Hvitarvatn data seems to provide rather secure information on glacier advance and retreat, a topic of considerable importance in the older (pre-IPCC) paleoclimate literature (e.g. Matthes 1940).

Bradley and Jones 1993, an article which (rather than Mann et al 1998) did much to initiate the present programme of “multiproxy” studies, referred back to that older literature (Matthes 1940) in its introduction. They argued that records of glacier advance and retreat were compromised by the absence of continuous records and advocated instead reliance only on precisely-dated continuous records (tree rings and ice cores, but, in practice, mostly tree rings). This programme subsequently dominated the field, in part because the IPCC highlighted such studies.

At several sites (Soper Lake and Donard Lake, Baffin Island; Lower Murray Lake, Ellesmere Island), varve thickness has been said to be positively correlated with summer temperature, though no significant correlation between varves and temperature has been reported at similar sites (East Lake, Melville Island; Ayr Lake, Baffin Island).

PAGES2K appears to have incorporated Hvitarvatn varve thickness data into their dataset (with thick varves denoting warmth) on the basis of this practice at other Arctic sites. However, the original authors clearly interpreted thick varves as evidencing the Little Ice Age (the existence of which in Iceland is established by numerous indicators). The PAGES2K version is clearly upside-down to this interpretation.

In a future post, I will compare Hvitarvatn varve thickness to varve thicknesses in the five “standard” varve thickness series used in Shi et al 2013 (and many other recent studies).

Bristlecone Addiction in Shi et al 2013

Recently, Robert Way drew attention to Shi et al 2013 (online here), a multiproxy study cited in AR5, but not yet discussed at CA.

The paper by Shi et al (2013) is fairly convincing as to at least the last 1,000 years in the Northern Hemisphere. I am actually surprised that paper has not been discussed here since it aims at dealing with many of the criticisms of paleoclimate research. They use 45 annual proxies which are all greater than 1,000 years in length and all have a “demonstrated” temperature relationship based on the initial authors’ interpretations.

Robert correctly observed that Shi et al was well within the multiproxy specialization of Climate Audit and warranted coverage here. However, now that I’ve examined it, I can report that it is reliant on the same Graybill bristlecone chronologies that were used in Mann et al 1998-99. While critics of Climate Audit have taken exception to my labeling the dependence of paleoclimatologists on bristlecone chronologies as an “addiction”, until paleoclimatologists cease the repeated use of this problematic data in supposedly “independent” reconstructions, I think that the term remains justified.

While Robert reported that all these series had a “demonstrated” temperature relationship according to the initial authors’ interpretation, this is categorically untrue for Graybill’s bristlecone chronologies, where the original authors said that the bristlecone growth pulse was not due to temperature and sought an explanation in CO2 fertilization. (The preferred CA view is that the pulse is due to mechanical deformation arising from high incidence of strip barking in the 19th century, but that is a separate story.) As a matter of fact, by and large, the bristlecone chronologies failed even Mann’s pick-two test.

Shi et al also show a no-dendro reconstruction, which has a much weaker semi-stick and mainly uses a subset of Kaufman et al 2009 data. In a forthcoming post, I’ll show that even this weak result is questionable due to their use of contaminated data and upside-down data (not Tiljander; something different).

Data Coverage in Cowtan and Way

As I was reading section 3 (Global temperature reconstruction) of the Cowtan and Way paper, I came across this text:

The HadCRUT4 map series was therefore renormalised to match the UAH baseline period of 1981-2010. For each map cell and each month of the year, the mean value of that cell during the baseline period was determined. If at least 15 of the 30 possible observations were present, an offset was applied to every value for that cell and month to bring the mean to zero; otherwise the cell was marked as unobserved for the whole period.

Renormalization is not a neutral step: coverage is very slightly reduced; however, the impact of changes in coverage over recent periods is also reduced. Coverage of the renormalized HadCRUT4 map series is reduced by about 2%.
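The renormalisation rule quoted above can be sketched in a few lines. This is an illustrative reimplementation from the paper’s description, not Cowtan and Way’s actual code, and the array layout (years × months) is my own assumption:

```python
import numpy as np

def renormalize(series, years, base_start=1981, base_end=2010, min_obs=15):
    """Re-baseline one grid cell's monthly anomaly series (shape: n_years x 12).

    For each calendar month: if at least `min_obs` of the baseline years have
    data, subtract the baseline mean so the cell averages zero over the
    baseline; otherwise mark that month unobserved (NaN) for the whole period.
    """
    out = series.copy()
    in_base = (years >= base_start) & (years <= base_end)
    for m in range(12):
        base_vals = series[in_base, m]
        n_obs = np.sum(~np.isnan(base_vals))
        if n_obs >= min_obs:
            out[:, m] = series[:, m] - np.nanmean(base_vals)
        else:
            out[:, m] = np.nan  # cell/month dropped entirely
    return out
```

The "at least 15 of 30" threshold is what produces the ~2% coverage loss noted above: any cell/month combination that is too sparsely observed during 1981-2010 is discarded for the entire record, not just the baseline years.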

This raised the question of the overall coverage of the HadCRUT dataset used in the study, as well as the effect of any changes in that coverage due to the methodology of the paper’s analysis. To address these issues, the data available on the paper’s website was downloaded and analysed using the R statistical package. The specific data used was contained in the file HadCRUT.4.1.1.0.median.txt, which can be downloaded as part of the full data used in the paper.


Behind the SKS Curtain

As a preamble and reprise, I think that it is reasonable for Cowtan and Way to take exception to HadCRU’s failure to estimate temperature in Arctic gridcells and to propose methods for estimating this temperature. At a time when the climate community argued that differences between the major indices and accessibility to CRU data didn’t “matter”, I thought that both mattered. One of the reasons for transparency in CRU data and methods was so that interested parties could carry out their own assessments, as Cowtan and Way have done. They have diagnosed a downward bias in recent HadCRU results. On previous occasions, I’ve observed that the community is more alert to errors that go the “wrong way” than to errors that go the “right way” and this opinion remains unchanged. As noted in my previous post, it doesn’t appear to me that their slight upward revision in temperature estimates has a material impact on the discrepancy between models and observations – a discrepancy which remains, despite efforts to spin otherwise.

In today’s post, I’ve re-examined Robert Way’s contributions to the secret SKS forum, where both he and Cowtan (Kevin C) have been long-time contributors. In my first post, I took exception to Way calling me a “conspiracy wackjob”. However, relative to the tenor of other SKS posts in which their colleagues fantasize about “ripping” out Anthony Watts’ throat and Anthony and I being perp-walked in handcuffs, Way’s language was relatively mild.

In addition, re-reading the relevant threads, other than a couple of occasions (ones to which I had taken exception), Way’s language was mostly temperate and well-removed from the conspiratorial fantasies about the “Denial Machine” that pervade too much of the SKS forum. In addition, this re-reading showed that, on numerous occasions, Way had agreed with Climate Audit critiques, sometimes in very forceful terms and usually against SKS forum opposition. Way typically accompanied these agreements with sideswipes to evidence his disdain for Climate Audit, but seldom, if ever, contradicted things that I had actually said.

I think that readers will be surprised at the degree of Way’s endorsement of the Climate Audit critique of Team paleoclimate practices.


Cowtan and Way 2013

There has been some discussion of the Cowtan and Way 2013 take on HadCRUT4 at Lucia’s, Judy Curry’s, Nick Stokes’ and elsewhere. HadCRUT4 has run cooler than other datasets (including UAH satellite) in recent years. Cowtan and Way observe that HadCRU does not estimate temperature in many Arctic gridcells. Because Arctic temperatures have risen more than low-latitude temperatures, they state that recent HadCRU temperatures are biased low. (Since GISS extrapolates into the Arctic, it is less affected by this bias.)

In the context of IPCC SOD Figure 1.5 (or similar comparison of models and observations), CW13 is slightly warmer than HadCRUT4, but the difference is small relative to the discrepancy between models and observations; the CW13 variation is also outside the Figure 1.5 envelope.

cowtanway2013 vs ipcc ar5sod figure 1_5
Figure 1. Cowtan and Way 2013 hybrid plotted onto IPCC AR5SOD Figure 1.5

Next, here is a simple plot showing the difference between the CW13 hybrid and HadCRUT 4. Up to the end of 2005, there was a zero trend between the two; the difference has arisen entirely since 2005.

cowtanway2013_difference between hadcru4 and cw2013
Figure 2. Delta between CW Hybrid (basis 1961-1990) and HadCRUT4.

In their online commentary, Cowtan and Way praise Hansen for being the first person to report the effect of missing Arctic data on global temperature. However, as of 2005, no material discrepancy had arisen between their index and the HadCRUT4 index; according to Cowtan and Way’s own data, Hansen was observing a discrepancy that had not yet arisen, making their praise of Hansen seem somewhat premature:

Probably the first mention of an underestimation of recent warming due to poor Arctic coverage comes from Hansen in 2006, who sought to explain why the NASA temperature data showed 2005 as being a record breaking warm year, in contrast to the Met Office temperature record.

That there are continuing defects in HadCRU methodology should hardly come as a surprise to CA readers. Attempts to reconcile and/or explain discrepancies between HadCRU and GISS also seem worthwhile to me.

Nor do efforts to apply kriging seem misplaced to me in principle. On the contrary, for someone with experience in ore reserves, it seems entirely natural; see, for example, some of Jeff Id’s discussion of Antarctica. I notice that their methodology results in changes to the Central England gridcell. While I don’t object to the use of kriging or similar methods to estimate values in missing gridcells, I don’t see any benefit to altering values in known gridcells, if that’s what’s happening here. (I haven’t parsed their methods and don’t plan to do so at this time.)

Co-author Way was an active participant at the secret SKS forum, where he actively fomented conspiracy theory allegations. Uniquely among participants in the secret SKS forum, he conceded that Climate Audit was frequently correct in its observations (“The fact of the matter is that a lot of the points he [McIntyre] brings up are valid”) and urged care in contradicting Climate Audit (“I wouldn’t want to go up against that group, between them there is a lot of statistical power to manipulate and make the data say what it needs to say.”) [Update Nov 21: While Way did wrongly associate me with conspiracy theory on a couple of occasions, including a tasteless accusation of being a "conspiracy wackjob", the vast majority of his language is temperate and reasonable and shows remarkable appreciation of the statistical points of our critique, with the slurs being a sort of incidental sideswipe. See the next post.]

Update: Here is an annotation of IPCC AR5 SOD Figure 11.12 comparing observations to CMIP5 RCP4.5, with both HadCRUT4 and CW13 outside the envelope.
cw13 versus sod figure 11_12 cmip5

Bart Verheggen compared CMIP5 RCP8.5 to observations, saying that “recent observations are at the low side of the CMIP5 model range”.

However, my own calculations using RCP8.5 show that observations are outside the envelope. Verheggen’s calculations are not consistent with similar calculations by others (including IPCC) and I presume that he’s made an error somewhere.

cw2013 cmip5 rcp85 _baseline 1980_1999

Another Absurd Lewandowsky Correlation

Lewandowsky’s recent article, “Role of Conspiracist Ideation”, continues his pontification on populations of 2, 1 and zero.

As observed here a couple of days ago, there were no respondents in the original survey who simultaneously believed that Diana faked her own death and was murdered. Nonetheless, in L13Role, Lewandowsky not only cited this faux example, but used it as a “hallmark” of conspiracist ideation:

For example, whereas coherence is a hallmark of most scientific theories, the simultaneous belief in mutually contradictory theories—e.g., that Princess Diana was murdered but faked her own death—is a notable aspect of conspiracist ideation [30].

However, this example is hardly an anomaly. The most cursory examination of L13 data shows other equally absurd examples.

One of the more amusing ones pertains to one of Lewandowsky’s signature assertions in Role, in which he claimed, echoing an almost identical assertion in Hoax, that “denial of the link between HIV and AIDS frequently involves conspiracist hypotheses, for example that AIDS was created by the U.S. Government [22–24].”

Lew reported a correlation of -0.111 between CYAIDS and CauseHIV, citing this correlation (together with negative correlations related to smoking and climate change) as follows:

The correlations confirm that rejection of scientific propositions is often accompanied by endorsement of scientific conspiracies pertinent to the proposition being rejected.

However, as with the fake Diana claims, Lewandowsky’s assertions are totally unsupported by his own data.

In the Role survey (1101 respondents), there were 53 who purported to disagree with the proposition that HIV caused AIDS (a vastly higher proportion than in the climate blog survey – a point that I will discuss separately). Of these 53 respondents, only two (3.8% of the 53 and 0.2% of the total) also purported to believe the proposition that the government caused AIDS. It is therefore simply untrue for Lewandowsky to assert, based on this data, that denial of the link between HIV and AIDS was either “frequently” or “often” accompanied by belief in the government AIDS conspiracy. It would be more accurate to say that it was “seldom” accompanied by such belief. Although Lewandowsky did not mention this, both of the two respondents who purported to believe this unlikely juxtaposition also believed that CO2 had caused serious negative damage over the past 50 years.

Lewandowsky’s assertion in Role about a supposed link between denial of a connection between HIV and AIDS and a government AIDS conspiracy had previously been made in Hoax not just once, but twice:

Likewise, rejection of the link between HIV and AIDS has been associated with the conspiratorial belief that HIV was created by the U.S. government to eradicate Black people (e.g., Bogart & Thorburn, 2005; Kalichman, Eaton, & Cherry, 2010)…

Thus, denial of HIV’s connection with AIDS has been linked to the belief that the U.S. government created HIV (Kalichman, 2009)

However, Lewandowsky’s false claim received even less support in the survey of stridently anti-skeptic Planet 3.0 blogs. Even with fraudulent responses, only 16 of 1145 (1.4%) purported to disagree with the proposition that HIV caused AIDS, and of these 16, only 2 (12.5%) also purported to endorse the CYAIDS conspiracy. These two respondents were the two respondents who implausibly purported to believe in every fanciful conspiracy. Even Tom Curtis of SKS argued that these responses were fraudulent. Without these two fraudulent responses, the real proportion in the blog survey is 0. Either way, the data contradicts Lewandowsky’s assertion that disagreement with the HIV-AIDS proposition is “often” or “frequently” accompanied by belief in the government AIDS conspiracy at the climate blogs surveyed by Lewandowsky.

Even though there were even fewer respondents supposedly subscribing to the unlikely propositions in the blog survey, the negative correlation between CYAIDS and CauseHIV propositions was even more extreme: a seemingly significant -0.31, though only the two fake respondents purported to hold the two unlikely propositions.

Update: I’ve added some plots below to illustrate how Lewandowsky’s calculations of correlation go awry.

The contingency table of CauseHIV and CYAIDS for the L13Hoax data is shown below, with the size of each circle proportional to the count in the contingency table. Most of the responses are identical – thus the large circle. Because there are only two respondents purporting to hold the two most unlikely views, this is a very faint dot. A correlation coefficient presumes a linear relationship (and significance tests presume normality of residuals): visually this is obviously not the case here. There are a variety of tests that could be applied and the supposed Lewandowsky correlation will fail all of them.

CauseHIVvsCYAIDS_Hoax
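To see how a handful of extreme respondents can manufacture a seemingly significant correlation in this kind of contingency table, here is a toy calculation. The cell counts are made up to mimic the structure described above (a large mass of near-identical responses plus two respondents at the opposite extreme); they are not Lewandowsky’s actual data:

```python
import numpy as np

# Hypothetical counts for (CauseHIV, CYAIDS) response pairs on a 1-4 Likert
# scale (1 = strongly disagree .. 4 = strongly agree). Nearly everyone
# strongly agrees that HIV causes AIDS and rejects the AIDS-conspiracy item;
# two outlier respondents hold the opposite extreme on both.
cells = {(4, 1): 1000, (3, 1): 100, (4, 2): 30, (3, 2): 13, (1, 4): 2}

cause_hiv = np.concatenate([np.full(n, cx, dtype=float)
                            for (cx, cy), n in cells.items()])
cy_aids = np.concatenate([np.full(n, cy, dtype=float)
                          for (cx, cy), n in cells.items()])

# Correlation with and without the two extreme respondents
r_all = np.corrcoef(cause_hiv, cy_aids)[0, 1]
keep = ~((cause_hiv == 1) & (cy_aids == 4))
r_trim = np.corrcoef(cause_hiv[keep], cy_aids[keep])[0, 1]
```

With counts like these, `r_all` comes out strongly negative (around -0.3) while `r_trim`, computed after dropping just the two extreme respondents, is far weaker: almost the entire "correlation" is carried by two rows out of more than a thousand.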

If one goes back to the underlying definition of a correlation coefficient, it is a dot-product of two vectors. In the context of a contingency table, this means that the contribution of each square in the contingency table to the correlation can be separately identified. I’ve done this in the graphic shown below, since the points, while elementary, are not immediately intuitive in these small-population situations. For each square in the contingency table, I’ve calculated the dot-product contribution and multiplied it by the count in the square, thereby giving the contribution to the correlation coefficient (which is the sum of the dot-product contributions). The area of each circle shows the contribution to the correlation coefficient: pink shows a negative contribution.

There are a few interesting points to observe. In a setup where nearly all the responses are identical and at one extreme, these responses make a positive contribution to the correlation coefficient. Responses in which the respondent strongly disagrees with CYAIDS but only agrees with CauseHIV, or in which the respondent strongly agrees with CauseHIV but only disagrees with CYAIDS, make a negative contribution to the correlation. Respondents with simple agreement with CauseHIV and simple disagreement with CYAIDS make a strong contribution to the correlation coefficient. The two (fake) respondents make a very large contribution to the correlation coefficient despite being only two responses.

CauseHIVvsCYAIDS r contributions_Hoax
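The per-cell decomposition described above can be sketched as follows. This is an illustrative reimplementation (standardize both variables, then accumulate the z-score products cell by cell), not the code used to produce the figure:

```python
import numpy as np

def r_contributions(x, y):
    """Decompose the Pearson correlation into per-cell contributions.

    Each unique (x, y) response pair accumulates
        count * z_x * z_y / n
    where z_x, z_y are z-scores (population sd). The contributions sum
    exactly to the Pearson correlation coefficient.
    """
    n = len(x)
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    contrib = {}
    for xi, yi, zxi, zyi in zip(x, y, zx, zy):
        contrib[(xi, yi)] = contrib.get((xi, yi), 0.0) + zxi * zyi / n
    return contrib
```

Cells in the upper-left and lower-right of the standardized plane (one variable above its mean, the other below) contribute negatively; a cell far from both means contributes heavily even with a count of two, which is exactly the effect visible in the figure above.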

A Scathing Indictment of Federally-Funded Nutrition Research

Edward Archer of the University of South Carolina, lead author of a scathing examination of U.S. federally-funded nutrition research, has written an even more scathing editorial in The Scientist (here) (H/t Margaret Wente of the Toronto Globe and Mail here.)

Some quotes:

We may be witnessing the confluence of two inherent components of the human condition: incompetence and self-interest

And while the self-correcting nature of science necessitates failure, the vast majority of nutrition’s failures were engendered by a complete lack of familiarity with the scientific method.

Rather than training graduate students in the scientific method, and allowing their research to serve the needs of society, the field’s leaders choose to train their mentees to serve only their own professional needs—namely, to obtain grant funding and publish their research.

But by not training mentees in the basics of science and skepticism, the nutrition field has fostered the use of measures that are so profoundly dissonant with scientific principles that they will never yield a definitive conclusion. As such, we now have multiple generations of nutrition researchers who dominate federal nutrition research and the peer review of that work, but lack the critical thinking skills necessary to critique or conduct sound scientific research.

The subjective data yielded by poorly formulated nutrition studies are also the perfect vehicle to perpetuate a never-ending cycle of ambiguous findings leading to ever-more federal funding.

Archer culminates with the following allegation (going much further than any of my comparatively mild critiques of climate scientists):

Perhaps more importantly, to waste finite health research resources on pseudo-quantitative methods and then attempt to base public health policy on these anecdotal “data” is not only inane, it is willfully fraudulent… The fact that nutrition researchers have known for decades that these techniques are invalid implies that the field has been perpetrating fraud against the US taxpayers for more than 40 years—far greater than any fraud perpetrated in the private sector (e.g., the Enron and Madoff scandals).

The study was not funded by the U.S. federal government, but by an “unrestricted research grant” from Coca-Cola.

This study was funded via an unrestricted research grant from The Coca-Cola Company. The sponsor of the study had no role in the study design, data collection, data analysis, data interpretation, or writing of the report.

I wonder if federally-funded nutrition scientists will respond with attacks on the Coke Brothers.
