## IPCC Figure SPM.1

Since we are discussing uncertainty intervals, I have a question — probably a dumb one, but what the heck.

In the Summary for Policy Makers, IPCC 4th AR, there is a graph “Figure SPM.1” on page 3. See here: http://www.ipcc.ch/pdf/assessment-report/ar4/syr/ar4_syr_spm.pdf

The graph shows “Global average surface temperature” since 1850. It includes individual data points for each year as well as an “uncertainty interval”. However, just looking at the graph, it appears that some 40+ years are outside this “uncertainty interval”. What does that “uncertainty interval” mean when 25% of the actual observations are outside of it?

UPDATE: Here are some notes on the provenance of this figure. Some aspects were previously discussed here and here.

Synthesis Report SPM
Here is the figure from the Synthesis Report SPM about which Michael Smith inquired. (I’m just going to show the temperature portion in the trail.)

Figure SPM.1. Observed changes in (a) global average surface temperature; (b) global average sea level from tide gauge (blue) and satellite (red) data and (c) Northern Hemisphere snow cover for March-April. All differences are relative to corresponding averages for the period 1961-1990. Smoothed curves represent decadal averaged values while circles show yearly values. The shaded areas are the uncertainty intervals estimated from a **comprehensive analysis of known uncertainties** (a and b) and from the time series (c). {Figure 1.1}

I’ve bolded the phrase “comprehensive analysis of known uncertainties” so that we can separately track the provenance of this claim.

Synthesis Report
SYR Fig 1.1 is identical to SYR SPM Figure 1, including the caption, other than the citation, where Figure 1.1 is now:

{WGI FAQ 3.1 Figure 1, Figure 4.2 and Figure 5.13, Figure SPM.3}

It also contains the statements:

Eleven of the last twelve years (1995-2006) rank among the twelve warmest years in the instrumental record of global surface temperature (since 1850). The 100-year linear trend (1906-2005) of 0.74 [0.56 to 0.92]°C is larger than the corresponding trend of 0.6 [0.4 to 0.8]°C (1901-2000) given in the Third Assessment Report (TAR) (Figure 1.1). The linear warming trend over the 50 years 1956-2005 (0.13 [0.10 to 0.16]°C per decade) is nearly twice that for the 100 years 1906-2005. {WGI 3.2, SPM}

WG1 SPM
WG1 SPM Figure SPM-3 (excerpt shown below) is identical to SYR SPM Figure 1.1. The provenance is again linked to the WG1 FAQ. The covering summary is for the most part identical to the SYR version, but also includes a statement that UHI effects have an effect of less than 0.006 deg per decade on land and zero over the ocean [this latter claim is not necessarily true as there are complicated adjustments of SST to land measurements and the claimed non-impact of UHI on ocean needs to be confirmed]:

Eleven of the last twelve years (1995-2006) rank among the 12 warmest years in the instrumental record of global surface temperature (since 1850). The updated 100-year linear trend (1906-2005) of 0.74 [0.56 to 0.92]°C is therefore larger than the corresponding trend for 1901-2000 given in the TAR of 0.6 [0.4 to 0.8]°C. The linear warming trend over the last 50 years (0.13 [0.10 to 0.16]°C per decade) is nearly twice that for the last 100 years. The total temperature increase from 1850-1899 to 2001-2005 is 0.76 [0.57 to 0.95]°C. Urban heat island effects are real but local, and have a negligible influence (less than 0.006°C per decade over land and zero over the oceans) on these values. {3.2}

WG1 SPM Figure SPM-3. Observed changes in (a) global average surface temperature; … All changes are relative to corresponding averages for the period 1961-1990. Smoothed curves represent decadal averaged values while circles show yearly values. The shaded areas are the uncertainty intervals estimated from a comprehensive analysis of known uncertainties (a and b) and from the time series (c). {FAQ 3.1, Figure 1, Figure 4.2 and Figure 5.13}

FAQ (Dec 2007 Version)
According to the Wayback Machine, the size of the FAQ has varied during 2007.

The summary for Topic 3.1 states:

Instrumental observations over the past 157 years show that temperatures at the surface have risen globally, with important regional variations. For the global average, warming in the last century has occurred in two phases, from the 1910s to the 1940s (0.35°C), and more strongly from the 1970s to the present (0.55°C). An increasing rate of warming has taken place over the last 25 years, and 11 of the 12 warmest years on record have occurred in the past 12 years. Above the surface, global observations since the late 1950s show that the troposphere (up to about 10 km) has warmed at a slightly greater rate than the surface, while the stratosphere (about 10-30 km) has cooled markedly since 1979.

The running text states:

Expressed as a global average, surface temperatures have increased by about 0.74°C over the past hundred years (between 1906 and 2005; see Figure 1). However, the warming has been neither steady nor the same in different seasons or in different locations. There was not much overall change from 1850 to about 1915, aside from ups and downs associated with natural variability but which may have also partly arisen from poor sampling. An increase (0.35°C) occurred in the global average temperature from the 1910s to the 1940s, followed by a slight cooling (0.1°C), and then a rapid warming (0.55°C) up to the end of 2006 (Figure 1). The warmest years of the series are 1998 and 2005 (which are statistically indistinguishable), and 11 of the 12 warmest years have occurred in the last 12 years (1995 to 2006). Warming, particularly since the 1970s, has generally been greater over land than over the oceans. Seasonally, warming has been slightly greater in the winter hemisphere. Additional warming occurs in cities and urban areas (often referred to as the urban heat island effect), but is confined in spatial extent, and its effects are allowed for both by excluding as many of the affected sites as possible from the global temperature data and by increasing the error range (the blue band in the figure).

The FAQ figure is drawn differently than the figures referred to above as shown below. Note that the trend lines and trend periods in the SPMs are taken from this graphic. Also note that this figure does not contain any link to the WG1 Report itself.

WG1 FAQ 3.1, Figure 1. (Top) Annual global mean observed temperatures (black dots) along with simple fits to the data. The left hand axis shows anomalies relative to the 1961 to 1990 average and the right hand axis shows the estimated actual temperature (°C). Linear trend fits to the last 25 (yellow), 50 (orange), 100 (purple) and 150 years (red) are shown, and correspond to 1981 to 2005, 1956 to 2005, 1906 to 2005, and 1856 to 2005, respectively. Note that for shorter recent periods, the slope is greater, indicating accelerated warming. The blue curve is a smoothed depiction to capture the decadal variations. To give an idea of whether the fluctuations are meaningful, decadal 5% to 95% (light blue) error ranges about that line are given (accordingly, annual values do exceed those limits). Results from climate models driven by estimated radiative forcings for the 20th century (Chapter 9) suggest that there was little change prior to about 1915, and that a substantial fraction of the early 20th-century change was contributed by naturally occurring influences including solar radiation changes, volcanism and natural variability. From about 1940 to 1970 the increasing industrialisation following World War II increased pollution in the Northern Hemisphere, contributing to cooling, and increases in carbon dioxide and other greenhouse gases dominate the observed warming after the mid-1970s.

WG1 Chapter 3
The closest match that I could locate in the relevant WG1 chapter was Figure 3.6 shown below.

Figure 3.6. Global and hemispheric annual combined land-surface air temperature and SST anomalies (°C) (red) for 1850 to 2006 relative to the 1961 to 1990 mean, along with 5 to 95% error bar ranges, from HadCRUT3 (adapted from Brohan et al., 2006). The smooth blue curves show decadal variations (see Appendix 3.A).

The trends and trend periods in WG1 Chapter 3 are all slightly different than in the FAQ. In most professional reports, the numbers in the FAQ would track the numbers in the main report. But here the numbers have been re-calculated for some reason.

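| WG1 Chapter 3 period | Trend (°C/decade) | FAQ period | Trend (°C/decade) |
|---|---|---|---|
| 1851-2005 | 0.042 | 1856-2005 | 0.045 |
| 1901-2005 | 0.071 | 1906-2005 | 0.076 |
| | | 1956-2005 | 0.128 |
| 1979-2005 | 0.163 | 1981-2005 | 0.177 |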

Brohan et al 2006
The underlying provenance of these uncertainties is Brohan et al, 2006 about which UC has commented disapprovingly. The corresponding figure in that article appears to be:

Brohan et al 2006 Figure 10: HadCRUT3 global temperature anomaly time-series (C) at monthly (top), annual (centre), and smoothed annual (bottom) resolutions. The solid black line is the best estimate value, the red band gives the 95% uncertainty range caused by station, sampling and measurement errors; the green band adds the 95% error range due to limited coverage; and the blue band adds the 95% error range due to bias error

Trend Estimates in WG1

The trend estimates in WG1 Chapter 3 (and the SPM results appear to be calculated similarly) are from Table 3.2 which states:

Table 3.2. Linear trends in hemispheric and global land-surface air temperatures, SST (shown in table as HadSST2) and Nighttime Marine Air Temperature (NMAT; shown in table as HadMAT1). Annual averages, with estimates of uncertainties for CRU and HadSST2, were used to estimate trends. Trends with 5 to 95% confidence intervals and levels of significance (bold: <1%; italic, 1-5%) were estimated by Restricted Maximum Likelihood (REML; see Appendix 3.A), which allows for serial correlation (first order autoregression AR1) in the residuals of the data about the linear trend. The Durbin Watson D-statistic (not shown) for the residuals, after allowing for first-order serial correlation, never indicates significant positive serial correlation.

We’ve previously discussed at length the inappropriate use of the Durbin-Watson statistic and its ad hoc application to “rebut” concerns about long-term persistence expressed by review commenters here and here. The issue of persistence has come into play with the recent criticism by Foster (Tamino) and realclimate associates of Schwartz where they took what seems to be an opportunistically opposite view of long-term persistence to that taken for the purpose of IPCC Review Comments.

1. Michael Smith
Posted Dec 30, 2007 at 8:00 AM | Permalink

Since we are discussing uncertainty intervals, I have a question — probably a dumb one, but what the heck.

In the Summary for Policy Makers, IPCC 4th AR, there is a graph “Figure SPM.1” on page 3. See here: http://www.ipcc.ch/pdf/assessment-report/ar4/syr/ar4_syr_spm.pdf

The graph shows “Global average surface temperature” since 1850. It includes individual data points for each year as well as an “uncertainty interval”. However, just looking at the graph, it appears that some 40+ years are outside this “uncertainty interval”. What does that “uncertainty interval” mean when 25% of the actual observations are outside of it?

2. bender
Posted Dec 30, 2007 at 8:33 AM | Permalink

That’s an excellent question, Michael.

One would have to drill down to see how this figure was constructed, because it is by no means obvious or intuitive how the confidence envelope was created, or what the dots represent. There are many ways the uncertainty might have been estimated.

At times like this, don’t you wish you had a turnkey script you could run?

3. Posted Dec 30, 2007 at 8:54 AM | Permalink

At times like this, dont you wish you had a turnkey script you could run?

Not sure if this helps, but figure 3 of this script

makes something that looks similar (just add annual values to that one). They assume a white-noise component, and thus the smoothed uncertainty is smaller than the annual uncertainty. Confusing figure anyway.
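To make the white-noise point concrete, here is a minimal sketch, not the script referred to above. It assumes purely independent annual errors, an 11-year running mean as a stand-in for the decadal smoother, and an invented annual sigma of 0.1; none of these choices comes from the IPCC figure itself.

```python
import math
import random

def running_mean(values, window):
    """Centered running mean; returns only positions with a full window."""
    half = window // 2
    return [sum(values[i - half:i + half + 1]) / window
            for i in range(half, len(values) - half)]

def fraction_outside(n_years=157, sigma=0.1, window=11, z=1.645, seed=0):
    """Share of annual values outside a 5-95% band drawn for the smoothed curve.

    Assumption: annual values are pure white noise with standard deviation
    sigma, so the smoothed curve's error shrinks by sqrt(window).
    """
    rng = random.Random(seed)
    annual = [rng.gauss(0.0, sigma) for _ in range(n_years)]
    smooth = running_mean(annual, window)
    half = window // 2
    band = z * sigma / math.sqrt(window)  # half-width valid for the smooth curve only
    outside = sum(1 for i, s in enumerate(smooth)
                  if abs(annual[i + half] - s) > band)
    return band, outside / len(smooth)

band, frac = fraction_outside()
print(f"smoothed half-width: {band:.3f}, annual half-width: {1.645 * 0.1:.3f}")
print(f"share of annual values outside the smoothed band: {frac:.2f}")
```

In this toy setup the band is narrower than the annual one by the square root of the window length, so a large share of annual values lands outside it even though nothing is wrong with the data.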

4. bender
Posted Dec 30, 2007 at 10:20 AM | Permalink

#85 Michael Smith:
UC #87 has it: the error bars are for the smoothed data, so they are unnaturally narrow. That is why the actual annual observations fall well outside the confidence envelope. See the chapter 3 FAQ Fig 3.1 in the technical document. It is far clearer than the SPM.

The reason smoothing is in some sense justified is that time series statistics and ensemble statistics are thought to be interchangeable (the ergodicity assumption). In other words, what happens over time hopefully tells you something about what could have happened over time. Smoothing is intended to take out the stochastic noise with the idea that you are getting a clearer view of the ensemble mean. That is the hope, anyways. Whether it is true is, I believe, an open question.

This probably sounds like mumbo jumbo. It all has to do with robustness of computed “trend”. To what degree is it external forcing signal by GHG *trend* vs internal stochastic *noise*? The redder the noise the more indistinguishable it is from trend. And lucia’s recent analysis suggests long-term persistence may be more of an issue than others suppose.
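The remark about red noise can be illustrated with a small Monte Carlo sketch. It assumes an AR(1) process scaled to unit variance as a stand-in for "red" noise (the lag-one coefficient 0.9 and the series length are illustrative choices, not estimates from any temperature record) and compares the spread of least-squares slopes fitted to trendless series.

```python
import math
import random
import statistics

def ols_slope(y):
    """Least-squares slope of y against its index 0..n-1."""
    n = len(y)
    xbar = (n - 1) / 2
    ybar = sum(y) / n
    sxy = sum((i - xbar) * (v - ybar) for i, v in enumerate(y))
    sxx = sum((i - xbar) ** 2 for i in range(n))
    return sxy / sxx

def ar1_series(n, phi, rng):
    """Trendless AR(1) noise with unit marginal variance (phi=0 is white noise)."""
    innovation_sd = math.sqrt(1.0 - phi * phi)
    x = rng.gauss(0.0, 1.0)  # start from the stationary distribution
    out = []
    for _ in range(n):
        out.append(x)
        x = phi * x + rng.gauss(0.0, innovation_sd)
    return out

def slope_spread(phi, n=100, trials=2000, seed=1):
    """Standard deviation of fitted slopes across many trendless series."""
    rng = random.Random(seed)
    slopes = [ols_slope(ar1_series(n, phi, rng)) for _ in range(trials)]
    return statistics.pstdev(slopes)

white = slope_spread(0.0)
red = slope_spread(0.9)
print(f"slope spread, white noise: {white:.4f}")
print(f"slope spread, AR(1) phi=0.9: {red:.4f}")
```

The same amount of variance produces a much wider scatter of apparent "trends" when the noise is persistent, which is why the choice of noise model matters for the stated trend uncertainty.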

5. rafa
Posted Dec 30, 2007 at 11:59 AM | Permalink

Re: 85, sorry if I’m saying something stupid. Could it be that what they call the uncertainty interval is the standard error (of the averaged values)? Then they can only guarantee that the true value of the measured quantity falls within the uncertainty interval 68% of the time. Something like 30% of the 150-odd yearly points (about 40+) might then fall outside the uncertainty range. Is this a correct assumption? I’m just guessing; the right answer would be for the authors to fully explain the method they used.
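The arithmetic behind this guess is easy to check. The sketch below assumes independent, normally distributed annual values and 157 yearly points (1850 to 2006); it says nothing about how the band in the figure was actually computed.

```python
import math

def expected_outside(n_points, z):
    """Expected number of independent normal points beyond +/- z sigma."""
    p_outside = 1.0 - math.erf(z / math.sqrt(2.0))
    return n_points * p_outside

n_years = 157  # assumption: one point per year, 1850-2006 inclusive
one_sigma = expected_outside(n_years, 1.0)      # 68% interval
five_to_95 = expected_outside(n_years, 1.645)   # 5-95% interval
print(f"expected outside +/-1 sigma: {one_sigma:.1f}")
print(f"expected outside 5-95% range: {five_to_95:.1f}")
```

A one-sigma band would leave roughly 50 points outside and a 5-95% band roughly 16, so counting points alone cannot settle which kind of interval was drawn.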

best

6. Michael Smith
Posted Dec 30, 2007 at 12:16 PM | Permalink

Bender and UC:

Thanks for answering my question in 85. And no, bender, what you wrote in 92 doesn’t sound like mumbo jumbo. It was well-explained. It is the kind of answer that teaches me something. Invaluable, really. Thanks again.

7. bender
Posted Dec 30, 2007 at 12:43 PM | Permalink

rafa’s wild guess #5 indicates exactly the value in having an audit trail for this stuff. It is by no means obvious how such calculations are done in climate science. Knowing that the 95% confidence interval is on the smoothed data doesn’t tell you much, really. How was the smoothing done? How was the envelope computed? Is it statistically robust? You need to know these things in order to really answer Michael’s excellent question: how is this graph to be interpreted?

We all know the earth is warming. The issue is how much, to what degree is it GHG-caused, will it continue, and if so, to what degree.

8. Paul29
Posted Dec 30, 2007 at 12:49 PM | Permalink

I understand that basing the error bars on the trend could explain the temperature graph (too many actual values outside the interval), but that does not explain the sea level rise graph (too few points outside the interval). (In a normal analysis using 90% confidence intervals, there should be about 15 points outside the intervals.) Is the trending analysis for sea level introducing greater uncertainty into the analysis? How can that be? The sea level rise trend looks very tight (especially compared to the global temperature data).

9. Brooks Hurd
Posted Dec 30, 2007 at 1:09 PM | Permalink

In my opinion, the surface temperature graph is an example of choosing a smoothing algorithm to produce the standard deviation that you want to show in your graph.

The sea level rise graph appears to be the result of cherry picking the data which is the closest to your pre-established trend. I have analyzed a lot of empirical data, and in my opinion, the sea level graph is not based on empirical data. The fit is too good. Furthermore, the error does not seem to be related to the data which the authors selected for their graph.

10. bender
Posted Dec 30, 2007 at 1:18 PM | Permalink

Sea level, as a cumulative process, is not as noisy a process as annual GMT. Did you check AR4? Sea level *change* would be more volatile, more like GMT.

11. Michael Smith
Posted Dec 30, 2007 at 1:23 PM | Permalink

I’m beginning to see your point, bender. As advised, I went to Chapter 3 and read FAQ fig. 3.1. It offered the following clarification.

To give an idea of whether the fluctuations are meaningful, decadal 5% to 95% (light
grey) error ranges about that line are given (accordingly, annual values do exceed those limits).

However, I don’t see where they specify exactly how they calculated that “error range”.

They do make statements in that FAQ like this:

Additional warming occurs in cities and
urban areas (often referred to as the urban heat island effect), but
is confined in spatial extent, and its effects are allowed for both
by excluding as many of the affected sites as possible from the
global temperature data and by increasing the error range (the
blue band in the figure).

That statement surprised me. I thought the official position was that the UHI effect had been eliminated from the land record.
But evidently some amount of it remains, requiring an adjustment to the error calculation. Doesn’t this essentially express agreement with Dr. McKitrick’s recent study?

There is also, in the same FAQ, this statement:

Microwave satellite data have been used to create a satellite temperature record
for thick layers of the atmosphere including the troposphere
(from the surface up to about 10 km) and the lower stratosphere
(about 10 to 30 km). Despite several new analyses with improved
cross-calibration of the 13 instruments on different satellites used
since 1979 and compensation for changes in observing time and
satellite altitude, some uncertainties remain in trends.

Does that mean they’ve also adjusted the error range calculation for those “some uncertainties” remaining in the trends?

Surely, someone knows exactly what they did.

12. Posted Dec 30, 2007 at 1:51 PM | Permalink

Some old stuff, slightly related

http://www.climateaudit.org/?p=1276#comment-118888

http://www.climateaudit.org/?p=1681#comment-114704

I think the source for smoothed temp plot is

(caution: deeply flawed paper)

13. Arthur Smith
Posted Dec 30, 2007 at 3:19 PM | Permalink

To answer the question of what the uncertainty range means you could start by reading the figure caption:

The shaded areas are the uncertainty intervals estimated from a comprehensive analysis of known uncertainties

The shaded area is drawn around the smoothed (10-year) curve, but the same “analysis of known uncertainties” presumably applies to every circle as well. They could instead have drawn the smoothed curve and put error bars on all the circles, which would have been a somewhat more normal way of displaying it. In any case, the 25% number just means that about a quarter of the time (like in 1998) the average temperature in a given year differs in a statistically significant way from the 10-year average.

14. Posted Dec 30, 2007 at 4:26 PM | Permalink

They could instead have drawn the smoothed curve and put error bars on all the circles, which would have been a somewhat more normal way of displaying it.

Now this is getting silly. That wouldn’t be the same thing at all. See for example

The biases (urbanisation, bucket correction etc.) have strong correlations in space and time, so they are just as large for decadal global averages as for monthly grid-point values.

Uncertainties for decadal and annual are different.

In any case, the 25% number just means that about a quarter of the time (like in 1998) the average temperature in a given year differs in a statistically significant way from the 10-year average.

And so 2006 temperature does not differ significantly from 1996-2016 average ? ( I think it is 21-point binomial )
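If the smoother really is a 21-point binomial filter (only a guess in this thread), its effect on independent errors can be worked out directly. The sketch below assumes white noise and ignores end effects and the correlated bias terms mentioned above.

```python
import math

def binomial_weights(n_points=21):
    """Normalized binomial filter weights, C(n-1, k) / 2**(n-1)."""
    order = n_points - 1
    total = 2 ** order
    return [math.comb(order, k) / total for k in range(n_points)]

weights = binomial_weights(21)
# For independent errors, the variance of a weighted mean scales with sum(w^2).
variance_factor = sum(w * w for w in weights)
sd_factor = math.sqrt(variance_factor)
print(f"weights sum to {sum(weights):.6f}")
print(f"white-noise sd is multiplied by about {sd_factor:.3f}")
```

Under these assumptions the random part of the uncertainty shrinks to roughly a third of its annual size, while any fully correlated bias term would pass through the filter unchanged.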

I’m not prepared to pursue my line of inquiry any longer …

15. steven mosher
Posted Dec 30, 2007 at 8:08 PM | Permalink

re 13.

Here arthur, let’s start with something simple from Jones’ paper on the errors in temperatures.

From the paper cited by UC, we find the following.

“Measurement error (ob) The random error
0.2dC (1 ) [Folland et al., 2001]; the monthly
average will be based on at least two readings a
day throughout the month, giving 60 or more
values contributing to the mean. So the error
in the monthly average will be at most
0.2/sqrt(60) = 0.03dC and this will be uncorrelated
with the value for any other station or
the value for any other month.”

So, do you think this is an accurate description of the measurement error of a land station?

Now, mind you this is the simplest matter in all of the error analysis. Does Jones get it correct?

Explain why or why not.

16. Arthur Smith
Posted Dec 30, 2007 at 9:03 PM | Permalink

Steven – Jones’ explanation sounds correct for uncorrelated (random) errors. Uncorrelated errors are reduced over a large number of samples by a factor of sqrt(number of samples) – multiplying the 60 samples per month by 12 months and by the 5000 stations gives a factor of about 1,900, so even if the random error in an average single station is 2 degrees C, the resulting random error in the mean for a year would be only about 0.001 degrees, completely inconsequential. Averaging over ten years would reduce that to 0.0003 degrees, but either way it’s not a relevant measure of uncertainty.
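The square-root arithmetic in this exchange can be reproduced in a few lines. The sketch takes the figures quoted in the thread at face value (0.2 per reading, 60 readings a month, 5000 stations, a 2 degree single-station error) and assumes every reading is independent, which is exactly the point under dispute.

```python
import math

def random_error_of_mean(sigma_single, n_independent):
    """Standard error of a mean of n independent readings."""
    return sigma_single / math.sqrt(n_independent)

# Figures quoted in the thread; independence is an assumption, not a finding.
monthly_station = random_error_of_mean(0.2, 60)
n_annual = 60 * 12 * 5000            # readings per year across all stations
reduction = math.sqrt(n_annual)
annual_global = random_error_of_mean(2.0, n_annual)
decadal_global = random_error_of_mean(2.0, 10 * n_annual)

print(f"monthly station error: {monthly_station:.3f}")
print(f"reduction factor for a year: {reduction:.0f}")
print(f"annual global: {annual_global:.5f}, decadal global: {decadal_global:.5f}")
```

The reduction factor comes out a little under 1,900, and the resulting numbers are tiny only because the calculation treats every reading as independent.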

The problem isn’t random measurement error, it’s systematic errors in the measuring instruments, “biases”, if you will. They’ve tried to estimate that; maybe they got it wrong, but it’s a standard approach (in high energy physics experiments they often publish a number with separate error estimates for random and systematic errors). Random errors can be estimated by looking at the standard deviation of a lot of samples, but systematic errors cannot, they can only be determined from an actual study of the reliability of the experimental techniques (and from comparison of measurements made with differing techniques).

Michael Smith’s question at the start seemed to stem from a misunderstanding of the time series of annual global mean temperatures as a collection of independent measurements of the same thing for which you could calculate a standard deviation. But it’s not, it’s a collection of data with distinct real values. This IPCC picture has tried to estimate systematic error levels (which would be the same numbers for annual or decadal averages) and that’s what you’re seeing in the shaded range, but it’s a completely separate number (and apparently slightly less) than the year-on-year standard deviation in global mean temperature.

17. chuck c
Posted Dec 30, 2007 at 9:21 PM | Permalink

If my understanding is correct, the SQRT(N) factor in the denominator used in the calculation of the standard error in the monthly average temperature is only valid if each of the samples (daily mean temperatures) is a prediction of the monthly mean, which they clearly are not. The math that connects that SQRT(N) factor to the standard error through the sample mean depends on the normal distribution of the daily measurements, and on the assumption that the daily measurements are a random variable. I don’t think either of these is true, making the Jones calculation not what would usually be called a standard error. I don’t know if it’s actually related to error at all.

I think there needs to be a much more sophisticated analysis here of the contribution of random error, even ignoring the possibilities for biasing errors in the methodology which could be much larger than the SE.

18. steven mosher
Posted Dec 30, 2007 at 9:27 PM | Permalink

RE 16. Arthur. How are observations made at land stations? Think very carefully. Reviewing an operators
guide would be a good start. Is it twice a day? or once a day? How many measurements are observed
at each observation? How is the observation of the measurements made? How are the observations of the measurements
recorded? Finally, How are the observations of the measurements manipulated into a final measurement?

Is jones correct?

19. steven mosher
Posted Dec 30, 2007 at 9:41 PM | Permalink

RE 16. Arthur. I doubt you will go find this out so I’ll just tell you. Jones asserts that there are two measurements
made per day and asserts that the standard error per measurement is 0.2C and asserts that the monthly error
is thus 0.2C/Sqrt(60).

Actually, it works like this. Once a day the observer goes out to the station to observe the instrument.

1. It is assumed that the time of observation is recorded correctly.
2. The observer inspects the instrument and records two values: one for Tmin, the other for Tmax.
3. These measures are correlated and not independent.
4. The observer makes a written record of the instrument recordings, ROUNDING the observations to the closest
degree.
5. The two rounded numbers are added together and divided by 2. This result is then rounded.
6. A monthly average is computed from the “30” daily figures ( tmax+tmin)/2

Calculate the monthly error
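One way to approach the exercise is by simulation. The sketch below follows steps 2 to 6 above, with several assumptions that are not in the thread: true Tmin and Tmax are drawn from arbitrary uniform ranges, the thermometer itself is perfect, and rounding is "half up". Only the rounding error is modelled.

```python
import math
import random
import statistics

def half_up(x):
    """Round to the nearest whole degree, ties upward (assumed convention)."""
    return math.floor(x + 0.5)

def daily_error(rng):
    """Recorded daily mean minus the true (Tmax + Tmin) / 2 for one day."""
    tmin = rng.uniform(0.0, 20.0)          # hypothetical true minimum
    tmax = tmin + rng.uniform(5.0, 15.0)   # hypothetical true maximum
    # Steps 4-5: round each reading, average, then round the average again.
    recorded = half_up((half_up(tmin) + half_up(tmax)) / 2)
    return recorded - (tmin + tmax) / 2

def monthly_errors(n_months=5000, days=30, seed=2):
    """Error of the monthly mean (step 6) for many simulated months."""
    rng = random.Random(seed)
    return [statistics.fmean(daily_error(rng) for _ in range(days))
            for _ in range(n_months)]

errors = monthly_errors()
bias = statistics.fmean(errors)
spread = statistics.pstdev(errors)
print(f"mean monthly error (bias): {bias:.3f}")
print(f"sd of monthly error: {spread:.3f}")
```

Under these assumptions the second rounding pushes about half of the daily values up by half a degree, so a bias of roughly a quarter of a degree survives the monthly averaging; a different tie-breaking rule would change that, which is why the recording procedure matters.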

20. bender
Posted Dec 30, 2007 at 9:55 PM | Permalink

#16 misunderstanding? excuse me?

21. Arthur Smith
Posted Dec 30, 2007 at 10:07 PM | Permalink

As I said, the argument applies to uncorrelated errors. If things are correlated, if the measurement techniques are poor, etc. that could introduce systematic bias, sure. Is that the point?

Whether there’s a factor of sqrt(2) missing or not, that’s pretty much irrelevant because once you average over thousands of measurements, the random errors remaining are tiny. It’s the systematic errors that matter. I’m surprised you folks are having such a hard time getting the point – maybe I’m not explaining it well. Have you guys ever actually done random and systematic error calculations for real science experiments yourselves?

22. bender
Posted Dec 30, 2007 at 10:09 PM | Permalink

From the guy who can’t eyeball a pair of regression slopes.

23. steven mosher
Posted Dec 30, 2007 at 10:39 PM | Permalink

re 21. Arthur. I asked you a simple question: was Jones correct in the very first rudimentary
error calculation he did in his paper? I even gave you the hint to look at an operator’s guide.

Now, what was the point of me asking you that question. The point was this. Would you think and
investigate or wave your arms? That’s how bender and I will tell if you are serious or not and worth
the time to engage.

For 5 years I worked in military operations research. Wars and bombs and things that go boom.

This stuff:

24. Arthur Smith
Posted Dec 30, 2007 at 11:59 PM | Permalink

Guys, I was only trying to be helpful on this thread, answering the question posted up front. I don’t understand the attacks, and have no interest in whoever Jones is or what he did wrong. I’m done with this thread, sorry.

25. bender
Posted Dec 31, 2007 at 4:14 AM | Permalink

Have you guys ever actually done random and systematic error calculations for real science experiments yourselves?

What would be helpful would be a script that reproduces the figure in question. That would take out all guesswork.

If you can explain what was done, by all means, go ahead.

26. bender
Posted Dec 31, 2007 at 4:17 AM | Permalink

You distribute a thousand thermometers around the globe. Are you censusing the global mean temperature field, or sampling it? Are these replicated observations, or different observations? Or unreplicated sub-observations?

Answers here will affect what you plot in terms of “error bars” on annual GMT observations and what you can infer from them.

27. Pierre Gosselin
Posted Dec 31, 2007 at 5:56 AM | Permalink

I’d say it means that they (IPCC) want to appear to be much more sure than what the actual data warrant. It’s like the weatherman saying he’s quite sure it won’t rain, when actually there are a lot of dark thundery clouds approaching.
“We’re quite sure about the temperature, even though the data shows we can’t be.”

28. Pierre Gosselin
Posted Dec 31, 2007 at 6:06 AM | Permalink

Why does the sea level rise, seemingly at the average centurial rate, during the cooling period (ca. 1945 – 1970)?

29. Raven
Posted Dec 31, 2007 at 6:22 AM | Permalink

Why does the sea level rise, seemingly at the average centurial rate, during the cooling period (ca. 1945 – 1970)?

It slows down during cooling periods. See

30. Raven
Posted Dec 31, 2007 at 6:24 AM | Permalink

BTW. The sea level graph comes from a good report by Idsos that critiques Hansen’s latest claims.
The full report is here: http://www.co2science.org/scripts/CO2ScienceB2C/education/reports/hansen/hansen.jsp

31. Filippo Turturici
Posted Dec 31, 2007 at 8:00 AM | Permalink

I think we are all missing a point: what uncertainty is.
Uncertainty is not a statistical question, nor just a random error, as the IPCC or Jones seem to suggest: uncertainty is how precise we can be, or how different two measurements have to be before they are no longer compatible (compatibility replaces equality in measurement).
It has a “solid” meaning, not just a mathematical one that can be corrected: if my thermometer has a 0.2°C error range, I can measure with a 0.2°C uncertainty and no less; all measurements apparently more precise than 0.2°C are simply meaningless, and that’s all. Here uncertainty is a physical limit, not a calculation error.

So we have (about) these sources of uncertainty:
1- instrumental;
2- systematic errors;
3- random errors;
4- calculus errors.
Mathematical tools, better programming and faster CPUs can reduce only the last two kinds of error to a very low level.
Kind 1 can be reduced only by using more precise instruments (commercial scientific thermometers have a 0.1-0.2°C uncertainty, so there is no way to get lower…).
Kind 2 can be reduced only by eliminating or correcting the sources of error (human behaviour, environment, displacement etc.), and often not simply by calculating them (in any case they should be analysed and calculated for every single instrument).

After we fix an uncertainty on our measurements, we can analyse them statistically: but a Gaussian curve or the like is not at all an uncertainty, just a way to assign probabilities (and not in itself a way to detect strange behaviour: having just a 0.1% chance does not in itself mean that something cannot happen).

Thus, I think the real point is being missed even though it is simple, and that uncertainty is often misinterpreted and underestimated by the IPCC et al.

32. Steve McIntyre
Posted Dec 31, 2007 at 10:10 AM | Permalink

I’ve added a long note on the provenance of SYR SPM Figure 1.

33. Bruce
Posted Dec 31, 2007 at 10:54 AM | Permalink

Isn’t the slope from 1910 to 1940 steeper than any other slope in the last 100 years?

SUV’s?

34. Mark T.
Posted Dec 31, 2007 at 10:57 AM | Permalink

War Jeeps.
Mark

35. steven mosher
Posted Dec 31, 2007 at 11:23 AM | Permalink

re 33..It’s very close to the present trend. I have the data here somewhere.

36. Phil.
Posted Dec 31, 2007 at 11:57 AM | Permalink

Re#25 & #26

Have you guys ever actually done random and systematic error calculations for real science experiments yourselves?

What would be helpful would be a script that reproduces the figure in question. That would take out all guesswork.

If you can explain what was done, by all means, go ahead.

You distribute a thousand thermometers around the globe. Are you censusing the global mean temperature field, or sampling it? Are these replicated observations, or different observations? Or unreplicated sub-observations?

Answers here will affect what you plot in terms of error bars on annual GMT observations and what you can infer from them.

37. bender
Posted Dec 31, 2007 at 11:57 AM | Permalink

Thanks for pulling this together with the graphs, Steve M.

“opportunistic view of long-term persistence” – remember that phrase.

38. Bruce
Posted Dec 31, 2007 at 12:03 PM | Permalink

Looking at the Sea Level Rise “rate”, I would suggest the modern period is the 3rd slowest rise compared to:

~1845-1875
~1910-1940

W

39. Steve McIntyre
Posted Dec 31, 2007 at 2:32 PM | Permalink

IPCC rejected any allowance for long-term persistence in their estimates of trend uncertainty as follows:

After already looking into this issue it is apparent that the Cohn and Lins method is likely wrong and misrepresents statistical significance by overestimating long term persistence. There is no known paper showing these are improved models.

Flash forward to Tamino, Schmidt et al on Schwartz where Schwartz argues that one of the implications of a short response time is a low climate sensitivity. Tamino et al argued that long-term persistence was important, that the response time was longer than Schwartz allowed for and there was a larger climate sensitivity.

Seems to me that there is typical sucking and blowing. If Tamino’s right, then the Review Comments by McKitrick and Cohn and Lins stand and the error bars on the trend are much larger than stated by IPCC. I wonder if Tamino’s going to point this out.

40. bender
Posted Dec 31, 2007 at 3:17 PM | Permalink

But, Steve M, help me out here. I thought the oceans had incredible “thermal inertia”. Is this not a source of “long-term persistence”? Is this not *exactly* what lucia/schwartz have tried recently to address? Is this not yet another case of “opportunistic viewing”?

41. bender
Posted Dec 31, 2007 at 3:19 PM | Permalink

42. Posted Jan 1, 2008 at 10:46 AM | Permalink

Brohan annual smoothed with upper and lower 95% uncertainty ranges from the combined effects of all the uncertainties and Brohan annual (circles) looks like this:

Along with Fig SPM.3. it looks like this:

43. Posted Jan 1, 2008 at 10:47 AM | Permalink

(good match, biased another up for clarity)

44. steven mosher
Posted Jan 1, 2008 at 11:05 AM | Permalink

re 43. Also note, UC, that Hadley have made at least two adjustments to the error since the IPCC publication date.

45. Posted Jan 1, 2008 at 12:59 PM | Permalink

Arthur Smith

They could instead have drawn the smoothed curve and put error bars on all the circles, which would have been a somewhat more normal way of displaying it.

Something like this:

?

Steven,

and still this problem is in the data, non-zero bias uncertainty over the normal period.

46. Posted Jan 1, 2008 at 3:29 PM | Permalink

And if you want to know how the smoothing was done in Brohan et al, take 21 binomial coefficients ( F=diag(fliplr(pascal(21)));F=F/sum(F) 🙂 ), pad the series using the end points to get around the fact that we don’t have future data, and then take a running weighted mean using those coefficients. Brohan doesn’t explain this; it is a trial-and-error result. The method is extremely sensitive to the last data point: add some noise with 0.1 C std to the last point and this is what you’ll get with different realizations:

Have you guys ever actually done random and systematic error calculations for non-causal-filtered results without future data?
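
For readers without MATLAB, the procedure described in this comment can be sketched in plain Python. This is only a minimal sketch under the assumptions stated above (21 binomial weights, series padded with its end values); it is a reconstruction of the trial-and-error description, not code from Brohan et al. The function names and the toy series are hypothetical and chosen for illustration.

```python
from math import comb
import random


def binomial_weights(n=21):
    """Row n-1 of Pascal's triangle, normalised to sum to 1.

    Intended to match the MATLAB expression diag(fliplr(pascal(21))),
    divided by its sum.
    """
    raw = [comb(n - 1, k) for k in range(n)]
    total = sum(raw)  # equals 2 ** (n - 1)
    return [r / total for r in raw]


def smooth_with_end_padding(series, weights):
    """Running weighted mean; missing neighbours are replaced by the end values."""
    half = len(weights) // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [
        sum(w * padded[i + j] for j, w in enumerate(weights))
        for i in range(len(series))
    ]


def last_point_weight(weights):
    """Share of the final smoothed value carried by the final observation.

    The last observation appears once as itself and again in every padded
    slot to its right, so the centre weight and all later weights add up.
    """
    half = len(weights) // 2
    return sum(weights[half:])


if __name__ == "__main__":
    w = binomial_weights(21)
    toy = [0.01 * t for t in range(150)]  # hypothetical series, not HadCRUT data
    base = smooth_with_end_padding(toy, w)
    rng = random.Random(0)
    for _ in range(5):
        # Perturb only the last observation with 0.1 std Gaussian noise.
        perturbed = toy[:-1] + [toy[-1] + rng.gauss(0.0, 0.1)]
        shifted = smooth_with_end_padding(perturbed, w)
        print(f"change in last smoothed value: {shifted[-1] - base[-1]:+.3f}")
    print(f"weight on last observation: {last_point_weight(w):.3f}")
```

Under these assumptions the last observation, together with its padded copies, carries about 0.59 of the total weight in the final smoothed value, so a 0.1 C perturbation of that single point moves the end of the smoothed curve by roughly 0.06 C.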

47. bender
Posted Jan 1, 2008 at 3:58 PM | Permalink

Yes. And that’s why I think the “error bars” on this curve are as meaningless as the ones on MBH99 recon. And as meaningless as the ones they put on GCM ensembles. Pseudostatistics. Just ask Wegman.

48. rafa
Posted Jan 2, 2008 at 3:43 AM | Permalink

Dear all, could someone summarize the answers to Michael’s question for dummies like me? Now we know where the figure comes from (Brohan et al 2006). We know that the FAQ figure does not contain any link to the WG1 Report itself (Steve emphasized this). Brohan et al used a particular smoothing method (criticized by UC and Bender). I’m sorry for being that stupid, but I still can’t see what the “physics” behind the uncertainty interval is. What do those points outside the shaded area mean physically? Even once everybody agrees that every single line in the IPCC report where statistics appear has to be carefully scrutinized because of poor information (deliberate or not), I still can’t answer Michael’s question: “What does that uncertainty interval mean?” Imagine the figure is the sales revenue per year. What happened in those years where the sales revenue is outside the shaded area? Thank you all. Have a nice 2008.

Steve: it’s not inconsistent with the WG1 data; it’s just that it’s been separately calculated. This is not the sort of thing that one anticipates in this sort of report. You would never see this in a summary report by a Canadian royal commission. See the CA links in the thread for discussion of the uncertainty estimates – it’s a large topic.

49. Jean S
Posted Jan 2, 2008 at 7:34 AM | Permalink

UC (or others), slightly off-topic: can you figure out how the “end smoothing” was done for this image (from here):

The actual smoothed curve (black) is probably done with a 10- or 11-point Gaussian filter, but there is no explanation of how the ends were obtained (nor did they bother to mark the ends with a different color etc).

50. Mike B
Posted Jan 2, 2008 at 11:04 AM | Permalink

#49 I’d like to help out Jean, but my Finnish skills have weakened over the years. 🙂

Do you have a link to the data?

51. Posted Jan 3, 2008 at 4:23 AM | Permalink

rafa

Brohan et al used a particular smoothing method (criticized by UC and Bender). I’m sorry for being that stupid, but I still can’t see what the “physics” behind the uncertainty interval is.

I think Brohan uses the model

$y=T+n$

where the observation y (or a combination of many obs.) is the true temperature T plus noise n. The noise component n is further divided into bias-like and uncorrelated errors. Then Brohan applies a linear(*) filter F

$F(y)=F(T)+F(n)$

and recomputes CIs using F(n) (uncorrelated noise variance drops, bias-like remains the same). So far so good; now the reference is F(T) instead of T. The problem is that F near the endpoints is a different function than in the middle. You can put the filter F in matrix form to get matrix-vector multiplications

$Fy=FT+Fn$

First, assume that the covariance matrix of n is the identity matrix (and n is zero mean). The covariance matrix of the smoothed noise F(n) is then $FF^T$. It is easy to see that the diagonal elements of this matrix are smaller in the middle than near the ends (and that the noise becomes correlated). This effect is not visible in Brohan’s data. The extreme smoothing case is to take

$F=X(X^TX)^{-1}X^T$

where the first column of X is full of ones, and the second is time (or vice versa). This is, of course, a least squares line fit. And you’ll observe that $FF^T=F$, because F is a symmetric projection matrix. Again, the diagonal elements near the ends are larger than in the middle. But this is trivial for those who have plotted confidence limits for fitted regression lines.
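
Both claims about the diagonal of $FF^T$ can be checked numerically. The following is a minimal sketch in plain Python, assuming unit-variance white noise, a 21-point binomial filter padded with end values for the first case, and X = [1, t] for the line fit. The function names and the series length of 60 are hypothetical choices for illustration, not anything taken from Brohan et al.

```python
from math import comb


def padded_filter_matrix(n_obs, weights):
    """Matrix F such that F times y is the end-padded running weighted mean of y."""
    half = len(weights) // 2
    F = [[0.0] * n_obs for _ in range(n_obs)]
    for i in range(n_obs):
        for j, w in enumerate(weights):
            # Padding with end values is the same as clamping the index.
            k = min(max(i + j - half, 0), n_obs - 1)
            F[i][k] += w
    return F


def line_fit_matrix(n_obs):
    """Projection matrix X (X^T X)^-1 X^T for X = [1, t], t = 0 .. n_obs - 1.

    Uses the closed form for simple regression:
    F[i][j] = 1/n + (t_i - mean)(t_j - mean) / Sxx.
    """
    t = list(range(n_obs))
    t_mean = sum(t) / n_obs
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    return [
        [1.0 / n_obs + (ti - t_mean) * (tj - t_mean) / sxx for tj in t]
        for ti in t
    ]


def smoothed_noise_variances(F):
    """Diagonal of F F^T: variance of each smoothed value for unit white noise."""
    return [sum(f * f for f in row) for row in F]


if __name__ == "__main__":
    n_obs = 60
    weights = [comb(20, k) / 2 ** 20 for k in range(21)]
    cases = [
        ("binomial, end-padded", padded_filter_matrix(n_obs, weights)),
        ("least squares line", line_fit_matrix(n_obs)),
    ]
    for name, F in cases:
        var = smoothed_noise_variances(F)
        mid = n_obs // 2
        print(f"{name}: first={var[0]:.3f} middle={var[mid]:.3f} last={var[-1]:.3f}")
```

In both cases the printed variance is larger at the two ends than in the middle. Since clamping the index only moves weights within a row, the end-padded filter is still an ordinary matrix, which is consistent with the footnote that the filter stays linear even with the padding procedure.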

Completely different approach would be to take a statistical model for T, and then apply statistical smoothing methods. But even in this case endpoint uncertainty would be larger, see
http://www.climateaudit.org/?p=1681#comment-114062

What do those points outside the shaded area mean physically?

Circles are annual data, they are not related to shaded area (uncertainty of smoothed data). Apples to oranges comparison, that’s why I said this figure is confusing.

(*) it is linear even with the padding procedure

52. RomanM
Posted Jan 3, 2008 at 8:28 AM | Permalink

#49 Jean S:

Without seeing the data itself, it is hard to be particularly certain, but the behaviour of the smoothing at the endpoints looks to be consistent with a moving average method that I have preferred to use in the past. For each specific calculation, the series is padded with the value of the smoothed result itself. Thus, the padding value changes as you get nearer to the endpoint. In actual practice this is equivalent to truncating the weights used in the smoothing. For example, when using relative weights of 1-4-6-4-1, the second-last point uses 1-4-6-4 and the end point would use 1-4-6. This tends to favour the endpoints slightly, but does not usually give extreme results as seen in the MBH papers.
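
The equivalence is easy to see: if the smoothed value s is also used as the padding value, then s times the sum of all weights equals the weighted sum of the available data plus s times the sum of the missing weights, so s is simply the weighted mean over the available points. A minimal Python sketch of this truncated-weight smoother follows. The function name and the example series are hypothetical, and nothing here is claimed to be the method actually used for the Finnish figure.

```python
def smooth_truncated(series, weights=(1, 4, 6, 4, 1)):
    """Weighted moving average that drops weights falling outside the series.

    The surviving weights are renormalised by their own sum, which is what
    padding with the smoothed value itself amounts to.
    """
    half = len(weights) // 2
    n = len(series)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j, w in enumerate(weights):
            k = i + j - half
            if 0 <= k < n:  # keep only weights that land on real data
                num += w * series[k]
                den += w
        out.append(num / den)
    return out


if __name__ == "__main__":
    example = [0.0, 1.0, 2.0, 3.0, 10.0]  # made-up values with a jump at the end
    for raw, smooth in zip(example, smooth_truncated(example)):
        print(f"{raw:5.1f} -> {smooth:6.3f}")
```

For the last point of the example the weights 1-4-6 apply to the values 2, 3 and 10, giving (2 + 12 + 60) / 11, and for the second-last point the weights 1-4-6-4 apply to 1, 2, 3 and 10, giving (1 + 8 + 18 + 40) / 15.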

53. Steve McIntyre
Posted Jan 3, 2008 at 10:41 AM | Permalink

#49. Jean S, do you have a ref for the digital data?

54. rafa
Posted Jan 3, 2008 at 10:52 AM | Permalink

UC, thank you!.

55. Posted Jan 3, 2008 at 11:03 AM | Permalink

c51

Apples to oranges comparison, that’s why I said this figure is confusing.

And if you want to see oranges to apples comparison, see Global surface temperatures over the past two millennia (2003) by Mann and Jones, Fig. 2c:

http://www.geocities.com/uc_edit/divergence.html

This seems to be decadal reconstruction, smoothed by 40 year (!) lowpass filter, but using uncertainties from ‘standard reconstructions’. In practice, Mann guesses annual temperature for 20 future years, and then makes a statement (my bold):

This warmth is, however, dwarfed by late 20th century warmth which is observed to be unprecedented at least as far back as AD 200.

Unbelievable, and then the story continues with response to Soon et al 04, On smoothing potentially non-stationary climate time series where he finally reveals how this smoothing was done.

Great work, Dr. Mann. I’ve collected a set of pictures for you, check them here

56. bender
Posted Jan 3, 2008 at 11:14 AM | Permalink

What does that uncertainty interval mean

What it really means and how the AR4 and SPM authors are trying to use it are probably two different things. They are trying to use it to suggest that the increasing trend is robust, is unlikely to occur by random chance alone. First, they are trying to suggest that the smoothed trend fits the actual annual data, which is why the dots and lines are plotted on the same graph. Second, they are trying to suggest that the trend in the line underlies an equal trend in the annual data (points), that instrumental error is not so large to preclude this inference.

What it really means – I don’t know. Putting the measurement error in with the sampling error makes no sense to me. You want to know how accurate your instruments are, sure. But as far as deducing a trend and attributing it to something like CO2, you would want a robust estimate of the stochastic (internal climatic) variability around that trend. Maybe ocean-caused long-term persistence would be such a large source of internal variability that this trend could occur by random chance alone? I don’t know. To me, it is not clear that the inference they are trying to make is justified by the data. My advice is to ask Wegman.

57. Posted Jan 4, 2008 at 3:21 PM | Permalink

53, ref for the digital data

Some detective work

Finnish Meteorological Institute Contributions

Reliable estimation of climatic variations in Finland

Tuomenvirta, Heikki; Doctoral dissertation

Maybe this is somewhat related to the data. See Figure 5.4 along with this one:

Quite close. Table 5.1 tells that T4(adj) is based on 4 station data (Helsinki, Kuopio, Kajaani and Oulu). Can’t answer to end-point handling problem, though.. Interesting one is the Fig. 5.2.

Where this data is compared to Jones and Moberg (2003) Fennoscandia grid-average. Waldo is in Jones data, but not in the other (not implying anything, but there’s always something about ’30s ) ..

58. RomanM
Posted Jan 5, 2008 at 12:11 PM | Permalink

Nice sleuthing, UC! Although I am quite familiar with a Baltic language, it is not (anything like) Finnish.
Unfortunately, it does not seem to answer Jean S’s original question on the endpoint smoothing. In the thesis referenced in #57:

Reliable estimation of climatic variations in Finland
( https://oa.doria.fi/bitstream/handle/10024/2634/reliable.pdf?sequence=1 … I can’t seem to get the link tags working properly here today.)

on p. 23, the author explains in great detail about the Gaussian smoothing, referred to as G3 and G10, used in the thesis. When it comes to the endpoints, however, the author says:

The first (last) few values in the filtered series are mainly determined by the original data following (preceding) the year in question. The filtered values near the both ends of the time series must therefore be interpreted with some caution. The shape of the curves can change when new values are added.

This says absolutely nothing about the actual method used in dealing with the endpoint smoothing. What is the Finnish for “Where’s the data?!!”.

59. Posted Jan 8, 2008 at 1:05 PM | Permalink

RomanM,

This says absolutely nothing about the actual method used in dealing with the endpoint smoothing.

Seems to be a topic that is skipped very often in climate publications. These smoothed present-day ‘comparisons’ to ’30s and past millennium are quite disturbing.

What is the Finnish for “Where’s the data?!!”.

The data might be somewhere in here http://www.smhi.se/hfa_coord/nordklim/index.php?page=dataset . Too much work to find out.. But I’m sure the author knows English well, if someone wants to ask directly.

60. Mike Rankin
Posted Jan 8, 2008 at 3:24 PM | Permalink

Re: Smoothing

It seems to me that this question has been examined previously. I don’t know which thread but it may have been in late 2005 or early 2006. It seems that the “smoothing” for the end points was actually only an extension with the same slope as the last two normally calculated points. I believe that this was in context with some paper by Mann.

61. John Norris
Posted May 1, 2009 at 3:35 PM | Permalink

I found this article in April 20 Newsweek ‘In the Great Ship Titanic’ where new Secretary of DOE Steven Chu references the 2007 IPCC report.

Zakaria: Skeptics say there’s still conflicting evidence on global warming.
Chu: I urge everyone to do this: Google the 2007 IPCC report. The 100-year trend is unmistakable. The first thing to emphasize is don’t get excited about one or two years. It’s just like you should not get excited that one very bad hurricane is evidence there’s global warming.

So I followed his advice. Having read most of the (4) IPCC assessment reports and numerous CA threads such as this of course it was very familiar territory. The first and most prevalent 100 year trend I could find in the Summary for Policy Makers was the subject graphic for this thread. So the graphic in this thread, with all its weaknesses, appears to be the favorite substantiation for AGW for the new DOE Secretary. I doubt he is aware of the shortcomings, perhaps he needs a brief.