IPCC Figure SPM.1

Michael Smith asks:

Since we are discussing uncertainty intervals, I have a question — probably a dumb one, but what the heck.

In the Summary for Policy Makers, IPCC 4th AR, there is a graph “Figure SPM.1” on page 3. See here: http://www.ipcc.ch/pdf/assessment-report/ar4/syr/ar4_syr_spm.pdf

The graph shows “Global average surface temperature” since 1850. It includes individual data points for each year as well as an “uncertainty interval”. However, just looking at the graph, it appears that some 40+ years are outside this “uncertainty interval”. What does that “uncertainty interval” mean when 25% of the actual observations are outside of it?

UPDATE: Here are some notes on the provenance of this figure. Some aspects were previously discussed here and here.

Synthesis Report SPM
Here is the figure from the Synthesis Report SPM about which Michael Smith inquired. (I’m just going to show the temperature portion in this thread.)

Figure SPM.1. Observed changes in (a) global average surface temperature; (b) global average sea level from tide gauge (blue) and satellite (red) data and (c) Northern Hemisphere snow cover for March-April. All differences are relative to corresponding averages for the period 1961-1990. Smoothed curves represent decadal averaged values while circles show yearly values. The shaded areas are the uncertainty intervals estimated from a **comprehensive analysis of known uncertainties** (a and b) and from the time series (c). {Figure 1.1}

I’ve bolded the phrase “comprehensive analysis of known uncertainties” so that we can separately track the provenance of this claim.

Synthesis Report
SYR Figure 1.1 is identical to SYR SPM Figure SPM.1, including the caption, except for the citation, which for Figure 1.1 now reads:

{WGI FAQ 3.1 Figure 1, Figure 4.2 and Figure 5.13, Figure SPM.3}

It also contains the statements:

Eleven of the last twelve years (1995-2006) rank among the twelve warmest years in the instrumental record of global surface temperature (since 1850). The 100-year linear trend (1906-2005) of 0.74 [0.56 to 0.92]°C is larger than the corresponding trend of 0.6 [0.4 to 0.8]°C (1901-2000) given in the Third Assessment Report (TAR) (Figure 1.1). The linear warming trend over the 50 years 1956-2005 (0.13 [0.10 to 0.16]°C per decade) is nearly twice that for the 100 years 1906-2005. {WGI 3.2, SPM}

WG1 SPM
WG1 SPM Figure SPM-3 (excerpt shown below) is identical to SYR Figure 1.1. The provenance is again linked to the WG1 FAQ. The covering summary is for the most part identical to the SYR version, but also includes a statement that UHI effects amount to less than 0.006°C per decade over land and zero over the ocean [this latter claim is not necessarily true, as there are complicated adjustments tying SST to land measurements, and the claimed non-impact of UHI on the ocean needs to be confirmed]:

Eleven of the last twelve years (1995-2006) rank among the 12 warmest years in the instrumental record of global surface temperature (since 1850). The updated 100-year linear trend (1906-2005) of 0.74 [0.56 to 0.92]°C is therefore larger than the corresponding trend for 1901-2000 given in the TAR of 0.6 [0.4 to 0.8]°C. The linear warming trend over the last 50 years (0.13 [0.10 to 0.16]°C per decade) is nearly twice that for the last 100 years. The total temperature increase from 1850-1899 to 2001-2005 is 0.76 [0.57 to 0.95]°C. Urban heat island effects are real but local, and have a negligible influence (less than 0.006°C per decade over land and zero over the oceans) on these values. {3.2}

WG1 SPM Figure SPM-3. Observed changes in (a) global average surface temperature; … All changes are relative to corresponding averages for the period 1961-1990. Smoothed curves represent decadal averaged values while circles show yearly values. The shaded areas are the uncertainty intervals estimated from a comprehensive analysis of known uncertainties (a and b) and from the time series (c). {FAQ 3.1, Figure 1, Figure 4.2 and Figure 5.13}

FAQ (Dec 2007 Version)
According to the Wayback Machine, the size of the FAQ has varied during 2007.

The summary for Topic 3.1 states:

Instrumental observations over the past 157 years show that temperatures at the surface have risen globally, with important regional variations. For the global average, warming in the last century has occurred in two phases, from the 1910s to the 1940s (0.35°C), and more strongly from the 1970s to the present (0.55°C). An increasing rate of warming has taken place over the last 25 years, and 11 of the 12 warmest years on record have occurred in the past 12 years. Above the surface, global observations since the late 1950s show that the troposphere (up to about 10 km) has warmed at a slightly greater rate than the surface, while the stratosphere (about 10–30 km) has cooled markedly since 1979.

The running text states:

Expressed as a global average, surface temperatures have increased by about 0.74°C over the past hundred years (between 1906 and 2005; see Figure 1). However, the warming has been neither steady nor the same in different seasons or in different locations. There was not much overall change from 1850 to about 1915, aside from ups and downs associated with natural variability but which may have also partly arisen from poor sampling. An increase (0.35°C) occurred in the global average temperature from the 1910s to the 1940s, followed by a slight cooling (0.1°C), and then a rapid warming (0.55°C) up to the end of 2006 (Figure 1). The warmest years of the series are 1998 and 2005 (which are statistically indistinguishable), and 11 of the 12 warmest years have occurred in the last 12 years (1995 to 2006). Warming, particularly since the 1970s, has generally been greater over land than over the oceans. Seasonally, warming has been slightly greater in the winter hemisphere. Additional warming occurs in cities and urban areas (often referred to as the urban heat island effect), but is confined in spatial extent, and its effects are allowed for both by excluding as many of the affected sites as possible from the global temperature data and by increasing the error range (the blue band in the figure).

The FAQ figure, shown below, is drawn differently from the figures referred to above. Note that the trend lines and trend periods in the SPMs are taken from this graphic. Also note that this figure does not contain any link to the WG1 Report itself.

WG1 FAQ 3.1, Figure 1. (Top) Annual global mean observed temperatures (black dots) along with simple fits to the data. The left hand axis shows anomalies relative to the 1961 to 1990 average and the right hand axis shows the estimated actual temperature (°C). Linear trend fits to the last 25 (yellow), 50 (orange), 100 (purple) and 150 years (red) are shown, and correspond to 1981 to 2005, 1956 to 2005, 1906 to 2005, and 1856 to 2005, respectively. Note that for shorter recent periods, the slope is greater, indicating accelerated warming. The blue curve is a smoothed depiction to capture the decadal variations. To give an idea of whether the fluctuations are meaningful, decadal 5% to 95% (light blue) error ranges about that line are given (accordingly, annual values do exceed those limits). Results from climate models driven by estimated radiative forcings for the 20th century (Chapter 9) suggest that there was little change prior to about 1915, and that a substantial fraction of the early 20th-century change was contributed by naturally occurring influences including solar radiation changes, volcanism and natural variability. From about 1940 to 1970 the increasing industrialisation following World War II increased pollution in the Northern Hemisphere, contributing to cooling, and increases in carbon dioxide and other greenhouse gases dominate the observed warming after the mid-1970s.

WG1 Chapter 3
The closest match that I could locate in the relevant WG1 chapter was Figure 3.6 shown below.

Figure 3.6. Global and hemispheric annual combined land-surface air temperature and SST anomalies (°C) (red) for 1850 to 2006 relative to the 1961 to 1990 mean, along with 5 to 95% error bar ranges, from HadCRUT3 (adapted from Brohan et al., 2006). The smooth blue curves show decadal variations (see Appendix 3.A).

The trends and trend periods in WG1 Chapter 3 are all slightly different from those in the FAQ, as the table below shows (trends in °C per decade). In most professional reports, the numbers in the FAQ would track the numbers in the main report; here, the numbers have been re-calculated for some reason.

WG1 Chapter 3          FAQ
1851-2005   0.042      1856-2005   0.045
1901-2005   0.071      1906-2005   0.076
                       1956-2005   0.128
1979-2005   0.163      1981-2005   0.177
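For readers who want to check such numbers, here is a minimal sketch of how per-decade trends over windows like these are presumably computed: an ordinary least-squares fit over each period, with the slope converted to °C per decade. The series below is a synthetic stand-in, not the HadCRUT3 data, so the printed numbers are illustrative only.

```python
# Sketch: OLS trend per window, slope reported in degC per decade.
# The anomaly series is synthetic (trend + acceleration + noise),
# standing in for HadCRUT3 annual anomalies.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1850, 2006)
t = years - 1850
anoms = -0.4 + 0.004 * t + 2e-5 * t**2 + rng.normal(0.0, 0.1, years.size)

for start, end in [(1856, 2005), (1906, 2005), (1956, 2005), (1981, 2005)]:
    m = (years >= start) & (years <= end)
    slope = np.polyfit(years[m], anoms[m], 1)[0]   # degC per year
    print(f"{start}-{end}: {10 * slope:+.3f} degC/decade")
```

As with the FAQ figure, the shorter recent windows pick up the acceleration and show steeper slopes.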

Brohan et al 2006
The underlying provenance of these uncertainties is Brohan et al, 2006 about which UC has commented disapprovingly. The corresponding figure in that article appears to be:

Brohan et al 2006 Figure 10: HadCRUT3 global temperature anomaly time-series (°C) at monthly (top), annual (centre), and smoothed annual (bottom) resolutions. The solid black line is the best estimate value; the red band gives the 95% uncertainty range caused by station, sampling and measurement errors; the green band adds the 95% error range due to limited coverage; and the blue band adds the 95% error range due to bias error.
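The nesting of the red, green and blue bands suggests the usual recipe of combining independent error components in quadrature, each band adding one more component. A sketch of that recipe, with made-up component sizes (not values from the paper):

```python
# Quadrature combination of independent 1-sigma error components into
# nested 95% bands. All sigma values here are hypothetical illustrations.
import numpy as np

sigma_station  = 0.03  # station, sampling and measurement error
sigma_coverage = 0.04  # limited-coverage error
sigma_bias     = 0.05  # bias error (urbanisation, bucket corrections, ...)

def band95(*sigmas):
    """Half-width of a 95% range from independent 1-sigma components."""
    return 1.96 * np.sqrt(sum(s * s for s in sigmas))

print("station only   +/-", round(band95(sigma_station), 3))
print("+ coverage     +/-", round(band95(sigma_station, sigma_coverage), 3))
print("+ bias         +/-", round(band95(sigma_station, sigma_coverage, sigma_bias), 3))
```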

Trend Estimates in WG1

The trend estimates in WG1 Chapter 3 (and the SPM results appear to be calculated similarly) are from Table 3.2 which states:

Table 3.2. Linear trends in hemispheric and global land-surface air temperatures, SST (shown in table as HadSST2) and Nighttime Marine Air Temperature (NMAT; shown in table as HadMAT1). Annual averages, with estimates of uncertainties for CRU and HadSST2, were used to estimate trends. Trends with 5 to 95% confidence intervals and levels of significance (bold: <1%; italic, 1–5%) were estimated by Restricted Maximum Likelihood (REML; see Appendix 3.A), which allows for serial correlation (first order autoregression AR1) in the residuals of the data about the linear trend. The Durbin Watson D-statistic (not shown) for the residuals, after allowing for first-order serial correlation, never indicates significant positive serial correlation.
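The REML machinery behind Table 3.2 is not spelled out in the chapter. A common, simpler stand-in for allowing AR1 serial correlation in trend uncertainties (not necessarily what the chapter authors did) is the effective-sample-size adjustment: fit OLS, estimate the lag-1 autocorrelation of the residuals, and widen the slope’s confidence interval accordingly.

```python
# Sketch: OLS trend with an AR(1) effective-sample-size correction,
# a simpler stand-in for the REML procedure cited in Table 3.2.
import numpy as np

def trend_ar1_ci(x, y):
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    r = np.corrcoef(resid[:-1], resid[1:])[0, 1]   # lag-1 autocorrelation
    n_eff = len(y) * (1 - r) / (1 + r)             # effective sample size
    se = resid.std(ddof=2) / (x.std() * np.sqrt(n_eff))
    return slope, slope - 1.645 * se, slope + 1.645 * se   # 5-95% range

rng = np.random.default_rng(1)
yrs = np.arange(1956, 2006).astype(float)
y = 0.013 * (yrs - yrs[0]) + rng.normal(0, 0.1, yrs.size)  # toy series
print([round(v, 4) for v in trend_ar1_ci(yrs, y)])
```

The stronger the residual autocorrelation, the smaller n_eff and the wider the 5-95% range; long-term persistence, which AR1 does not capture, would widen it further still.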

We’ve previously discussed at length the inappropriate use of the Durbin-Watson statistic and its ad hoc application to “rebut” concerns about long-term persistence expressed by review commenters here and here. The issue of persistence has come into play with the recent criticism of Schwartz by Foster (Tamino) and realclimate associates, in which they took what seems to be an opportunistically opposite view of long-term persistence to that taken for the purpose of responding to IPCC Review Comments.

61 Comments

  1. Michael Smith
    Posted Dec 30, 2007 at 8:00 AM | Permalink

    Since we are discussing uncertainty intervals, I have a question — probably a dumb one, but what the heck.

    In the Summary for Policy Makers, IPCC 4th AR, there is a graph “Figure SPM.1” on page 3. See here: http://www.ipcc.ch/pdf/assessment-report/ar4/syr/ar4_syr_spm.pdf

    The graph shows “Global average surface temperature” since 1850. It includes individual data points for each year as well as an “uncertainty interval”. However, just looking at the graph, it appears that some 40+ years are outside this “uncertainty interval”. What does that “uncertainty interval” mean when 25% of the actual observations are outside of it?

  2. bender
    Posted Dec 30, 2007 at 8:33 AM | Permalink

    That’s an excellent question, Michael.

    One would have to drill down to see how this figure was constructed, because it is by no means obvious or intuitive how the confidence envelope was created, or what the dots represent. There are many ways the uncertainty might have been estimated.

    At times like this, don’t you wish you had a turnkey script you could run?

  3. Posted Dec 30, 2007 at 8:54 AM | Permalink

    At times like this, don’t you wish you had a turnkey script you could run?

    Not sure if this helps, but figure 3 of this script

    http://www.geocities.com/uc_edit/hadcr3.txt

    makes something that looks similar (just add annual values to that one). They assume a white noise component, and thus the smoothed uncertainty is smaller than the annual uncertainty. Confusing figure anyway…
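The white-noise point checks out in a few lines (sigma and the 21 binomial weights here are illustrative assumptions, not values from the script above):

```python
# If annual errors are white noise with std sigma, a normalised
# moving-average filter with weights w has error sigma*sqrt(sum(w^2)),
# which is well below sigma -- hence a narrower band around the
# smoothed curve than around the annual points.
import numpy as np
from math import comb

w = np.array([comb(20, k) for k in range(21)], dtype=float)  # binomial weights
w /= w.sum()
sigma = 0.05                                  # hypothetical annual 1-sigma error
print("annual error   :", sigma)
print("smoothed error :", round(sigma * np.sqrt((w**2).sum()), 4))
```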

  4. bender
    Posted Dec 30, 2007 at 10:20 AM | Permalink

    #85 Michael Smith:
    UC #87 has it: the error bars are for the smoothed data, so they are unnaturally narrow. That is why the actual annual observations fall well outside the confidence envelope. See the chapter 3 FAQ Fig 3.1 in the technical document. It is far clearer than the SPM.

    The reason smoothing is in some sense justified is that time series statistics and ensemble statistics are thought to be interchangeable (the ergodicity assumption). In other words, what happens over time hopefully tells you something about what could have happened across an ensemble of alternative realizations. Smoothing is intended to take out the stochastic noise with the idea that you are getting a clearer view of the ensemble mean. That is the hope, anyways. Whether it is true is, I believe, an open question.

    This probably sounds like mumbo jumbo. It all has to do with the robustness of the computed “trend”. To what degree is it an external GHG forcing *trend* vs internal stochastic *noise*? The redder the noise, the more indistinguishable it is from trend. And lucia’s recent analysis suggests long-term persistence may be more of an issue than others suppose.
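A quick simulation of the “redder the noise” point, using trendless AR(1) noise (all parameters illustrative):

```python
# Trendless AR(1) noise still produces apparent century-scale trends,
# and the spread of those trends grows as the noise gets redder.
import numpy as np

rng = np.random.default_rng(42)

def ar1(n, phi, sigma=0.1):
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0.0, sigma)
    return x

years = np.arange(100.0)
for phi in (0.0, 0.5, 0.9):
    slopes = [np.polyfit(years, ar1(100, phi), 1)[0] for _ in range(2000)]
    print(f"phi={phi}: 1-sigma spread of 100-yr trends ="
          f" {10 * np.std(slopes):.3f} degC/decade")
```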

  5. rafa
    Posted Dec 30, 2007 at 11:59 AM | Permalink

    Re: 85, sorry if I’m saying something stupid. Could it be that what they call the uncertainty interval is the standard error (of the averaged values)? Then they can only guarantee that 68% of the time the true value of the measured quantity falls within the uncertainty interval. Something like 30% of 150 years (about 40+ points) might fall outside the uncertainty range. Is this a correct assumption? Just guessing; the right answer is the authors fully explaining the method they used.

    best

  6. Michael Smith
    Posted Dec 30, 2007 at 12:16 PM | Permalink

    Bender and UC:

    Thanks for answering my question in 85. And no, bender, what you wrote in 92 doesn’t sound like mumbo jumbo. It was well-explained. It is the kind of answer that teaches me something. Invaluable, really. Thanks again.

  7. bender
    Posted Dec 30, 2007 at 12:43 PM | Permalink

    rafa’s wild guess #5 indicates exactly the value in having an audit trail for this stuff. It is by no means obvious how such calculations are done in climate science. Knowing that the 95% confidence interval is on the smoothed data doesn’t tell you much, really. How was the smoothing done? How was the envelope computed? Is it statistically robust? You need to know these things in order to really answer Michael’s excellent question: how is this graph to be interpreted?

    We all know the earth is warming. The issue is how much, to what degree is it GHG-caused, will it continue, and if so, to what degree.

  8. Paul29
    Posted Dec 30, 2007 at 12:49 PM | Permalink

    I understand that basing the error bars on the trend could explain the temperature graph (too many actual values outside the interval), but that does not explain the sea level rise graph (too few points outside the interval). (In a normal analysis using 90% confidence intervals, there should be about 15 points outside the intervals.) Is the trending analysis for sea level introducing greater uncertainty into the analysis? How can that be? The sea level rise trend looks very tight (especially compared to the global temperature data).

  9. Brooks Hurd
    Posted Dec 30, 2007 at 1:09 PM | Permalink

    In my opinion, the surface temperature graph is an example of choosing a smoothing algorithm to produce the standard deviation that you want to show in your graph.

    The sea level rise graph appears to be the result of cherry picking the data which is closest to your pre-established trend. I have analyzed a lot of empirical data, and in my opinion, the sea level graph is not based on empirical data. The fit is too good. Furthermore, the error does not seem to be related to the data which the authors selected for their graph.

  10. bender
    Posted Dec 30, 2007 at 1:18 PM | Permalink

    Sea level, as a cumulative process, is not as noisy a process as annual GMT. Did you check AR4? Sea level *change* would be more volatile, more like GMT.

  11. Michael Smith
    Posted Dec 30, 2007 at 1:23 PM | Permalink

    I’m beginning to see your point, bender. As advised, I went to Chapter 3 and read FAQ fig. 3.1. It offered the following clarification.

    To give an idea of whether the fluctuations are meaningful, decadal 5% to 95% (light grey) error ranges about that line are given (accordingly, annual values do exceed those limits).

    However, I don’t see where they specify exactly how they calculated that “error range”.

    They do make statements in that FAQ like this:

    Additional warming occurs in cities and urban areas (often referred to as the urban heat island effect), but is confined in spatial extent, and its effects are allowed for both by excluding as many of the affected sites as possible from the global temperature data and by increasing the error range (the blue band in the figure).

    That statement surprised me. I thought the official position was that the UHI effect had been eliminated from the land record.
    But evidently some amount of it remains, requiring an adjustment to the error calculation. Doesn’t this essentially express agreement with Dr. McKitrick’s recent study?

    There is also, in the same FAQ, this statement:

    Microwave satellite data have been used to create a ‘satellite temperature record’ for thick layers of the atmosphere including the troposphere (from the surface up to about 10 km) and the lower stratosphere (about 10 to 30 km). Despite several new analyses with improved cross-calibration of the 13 instruments on different satellites used since 1979 and compensation for changes in observing time and satellite altitude, some uncertainties remain in trends.

    Does that mean they’ve also adjusted the error range calculation for those “some uncertainties” remaining in the trends?

    Surely, someone knows exactly what they did.

  12. Posted Dec 30, 2007 at 1:51 PM | Permalink

    Some old stuff, slightly related

    http://www.climateaudit.org/?p=1276#comment-118888

    http://www.climateaudit.org/?p=1681#comment-114704

    I think the source for smoothed temp plot is

    HadCRUT3_accepted.pdf

    (caution: deeply flawed paper)

    see also

    http://hadobs.metoffice.com/hadcrut3/diagnostics/global/nh+sh/

  13. Arthur Smith
    Posted Dec 30, 2007 at 3:19 PM | Permalink

    To answer the question of what the uncertainty range means you could start by reading the figure caption:

    The shaded areas are the uncertainty intervals estimated from a comprehensive analysis of known uncertainties

    The shaded area is drawn around the smoothed (10-year) curve, but the same “analysis of known uncertainties” presumably applies to every circle as well. They could instead have drawn the smoothed curve and put error bars on all the circles, which would have been a somewhat more normal way of displaying it. In any case, the 25% number just means that about a quarter of the time (like in 1998) the average temperature in a given year differs in a statistically significant way from the 10-year average.

  14. Posted Dec 30, 2007 at 4:26 PM | Permalink

    They could instead have drawn the smoothed curve and put error bars on all the circles, which would have been a somewhat more normal way of displaying it.

    Now this is getting silly. That wouldn’t be the same thing at all. See for example
    http://hadobs.metoffice.com/hadcrut3/diagnostics/time-series.html :

    The biases (urbanisation, bucket correction etc.) have strong correlations in space and time, so they are just as large for decadal global averages as for monthly grid-point values.

    Uncertainties for decadal and annual are different.

    In any case, the 25% number just means that about a quarter of the time (like in 1998) the average temperature in a given year differs in a statistically significant way from the 10-year average.

    And so the 2006 temperature does not differ significantly from the 1996-2016 average? (I think it is a 21-point binomial)

    I’m not prepared to pursue my line of inquiry any longer …

  15. steven mosher
    Posted Dec 30, 2007 at 8:08 PM | Permalink

    re 13.

    Here, Arthur, let’s start with something simple from Jones’ paper on the errors in temperatures.

    From the paper cited by UC, we find the following.

    “Measurement error (ε_ob): The random error in a single thermometer reading is about 0.2°C (1σ) [Folland et al., 2001]; the monthly average will be based on at least two readings a day throughout the month, giving 60 or more values contributing to the mean. So the error in the monthly average will be at most 0.2/sqrt(60) = 0.03°C and this will be uncorrelated with the value for any other station or the value for any other month.”

    So, do you think this is an accurate description of the measurement error of a land station?

    Now, mind you this is the simplest matter in all of the error analysis. Does Jones get it correct?

    Explain why or why not.

    You may refer to land station operator instructions in your answer.

  16. Arthur Smith
    Posted Dec 30, 2007 at 9:03 PM | Permalink

    Steven – Jones’ explanation sounds correct for uncorrelated (random) errors. Uncorrelated errors are reduced over a large number of samples by a factor of sqrt(number of samples) – multiplying the 60 samples per month by 12 months and by the 5000 stations, that’s a factor of nearly 2000, so even if the random error in a single station reading is 2 degrees C, the resulting random error in the mean for a year would be only about 0.001 degrees, completely inconsequential. Averaging over ten years would reduce that to 0.0003 degrees, but either way it’s not a relevant measure of uncertainty.

    The problem isn’t random measurement error, it’s systematic errors in the measuring instruments, “biases”, if you will. They’ve tried to estimate that; maybe they got it wrong, but it’s a standard approach (in high energy physics experiments they often publish a number with separate error estimates for random and systematic errors). Random errors can be estimated by looking at the standard deviation of a lot of samples, but systematic errors cannot, they can only be determined from an actual study of the reliability of the experimental techniques (and from comparison of measurements made with differing techniques).

    Michael Smith’s question at the start seemed to stem from a misunderstanding of the time series of annual global mean temperatures as a collection of independent measurements of the same thing for which you could calculate a standard deviation. But it’s not, it’s a collection of data with distinct real values. This IPCC picture has tried to estimate systematic error levels (which would be the same numbers for annual or decadal averages) and that’s what you’re seeing in the shaded range, but it’s a completely separate number (and apparently slightly less) than the year-on-year standard deviation in global mean temperature.

  17. chuck c
    Posted Dec 30, 2007 at 9:21 PM | Permalink

    If my understanding is correct, the SQRT(N) factor in the denominator used in the calculation of the standard error in the monthly average temperature is only valid if each of the samples (daily mean temperatures) is an independent estimate of the monthly mean, which they clearly are not. The math that connects that SQRT(N) factor to the standard error through the sample mean depends on the normal distribution of the daily measurements, and on the assumption that the daily measurements are a random variable. I don’t think either of these is true, making the Jones calculation not what would usually be called a standard error. I don’t know if it’s actually related to error at all.

    I think there needs to be a much more sophisticated analysis here of the contribution of random error, even ignoring the possibilities for biasing errors in the methodology which could be much larger than the SE.

  18. steven mosher
    Posted Dec 30, 2007 at 9:27 PM | Permalink

    RE 16. Arthur. How are observations made at land stations? Think very carefully. Reviewing an operator’s guide would be a good start. Is it twice a day? Or once a day? How many measurements are observed at each observation? How is the observation of the measurements made? How are the observations of the measurements recorded? Finally, how are the observations of the measurements manipulated into a final measurement?

    Is Jones correct?

    Read the operator’s manual.

  19. steven mosher
    Posted Dec 30, 2007 at 9:41 PM | Permalink

    RE 16. Arthur. I doubt you will go find this out so I’ll just tell you. Jones asserts that there are two measurements made per day, asserts that the standard error per measurement is 0.2°C, and asserts that the monthly error is thus 0.2°C/sqrt(60).

    Actually, it works like this. Once a day the observer goes out to the station to observe the instrument.

    1. It is assumed that the time of observation is recorded correctly.
    2. The observer inspects the instrument and records two values: one for Tmin, the other for Tmax.
    3. These measures are correlated and not independent.
    4. The observer makes a written record of the instrument readings, ROUNDING the observations to the closest degree.
    5. The two rounded numbers are added together and divided by 2. This result is then rounded.
    6. A monthly average is computed from the “30” daily figures (Tmax+Tmin)/2.

    Calculate the monthly error
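A Monte Carlo sketch of that procedure, for comparison with Jones’s 0.2/sqrt(60) ≈ 0.03°C; the error correlation and the temperature ranges below are made-up assumptions:

```python
# Simulate the procedure above: correlated 0.2C instrument errors on
# Tmin and Tmax, each reading rounded to the nearest degree, the daily
# mean rounded again, 30 daily figures averaged per month.
import numpy as np

rng = np.random.default_rng(7)
n_months, n_days = 20000, 30

tmin = rng.uniform(5, 10, (n_months, n_days))          # hypothetical true Tmin
tmax = tmin + rng.uniform(5, 15, (n_months, n_days))   # hypothetical true Tmax

common = rng.normal(0, 0.2, (n_months, n_days))        # shared error component
e_min = np.sqrt(0.5) * (common + rng.normal(0, 0.2, (n_months, n_days)))
e_max = np.sqrt(0.5) * (common + rng.normal(0, 0.2, (n_months, n_days)))

daily = np.round((np.round(tmin + e_min) + np.round(tmax + e_max)) / 2)
err = (daily - (tmin + tmax) / 2).mean(axis=1)         # monthly-mean error

print("Jones's figure    :", round(0.2 / np.sqrt(60), 4))
print("simulated 1-sigma :", round(err.std(), 4))
```

With rounding and correlated readings included, the simulated monthly error comes out noticeably larger than the 0.03°C figure.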

  20. bender
    Posted Dec 30, 2007 at 9:55 PM | Permalink

    #16 misunderstanding? excuse me?

  21. Arthur Smith
    Posted Dec 30, 2007 at 10:07 PM | Permalink

    As I said, the argument applies to uncorrelated errors. If things are correlated, if the measurement techniques are poor, etc. that could introduce systematic bias, sure. Is that the point?

    Whether there’s a factor of sqrt(2) missing or not, that’s pretty much irrelevant because once you average over thousands of measurements, the random errors remaining are tiny. It’s the systematic errors that matter. I’m surprised you folks are having such a hard time getting the point – maybe I’m not explaining it well. Have you guys ever actually done random and systematic error calculations for real science experiments yourselves?

  22. bender
    Posted Dec 30, 2007 at 10:09 PM | Permalink

    From the guy who can’t eyeball a pair of regression slopes.

  23. steven mosher
    Posted Dec 30, 2007 at 10:39 PM | Permalink

    re 21. Arthur. I asked you a simple question: was Jones correct in the very first rudimentary error calculation he did in his paper? I even gave you the hint to look at an operator’s guide.

    Now, what was the point of me asking you that question? The point was this. Would you think and investigate, or wave your arms? That’s how bender and I will tell whether you are serious and worth the time to engage.

    As to your question about doing error calculations for scientific experiments: the answer is yes. For 5 years I worked in military operations research. Wars and bombs and things that go boom.

    This stuff:

    http://books.google.com/books?id=PUKzU6v96vUC&pg=PA112&lpg=PA112&dq=jmem+joint&source=web&ots=-9e4iiGbv2&sig=gnItuVtkEEKVdIgNDHb1Qwu7kxg

  24. Arthur Smith
    Posted Dec 30, 2007 at 11:59 PM | Permalink

    Guys, I was only trying to be helpful on this thread, answering the question posted up front. I don’t understand the attacks, and have no interest in whoever Jones is or what he did wrong. I’m done with this thread, sorry.

  25. bender
    Posted Dec 31, 2007 at 4:14 AM | Permalink

    Have you guys ever actually done random and systematic error calculations for real science experiments yourselves?

    This is your idea of being helpful?

    What would be helpful would be a script that reproduces the figure in question. That would take out all guesswork.

    If you can explain what was done, by all means, go ahead.

  26. bender
    Posted Dec 31, 2007 at 4:17 AM | Permalink

    You distribute a thousand thermometers around the globe. Are you censusing the global mean temperature field, or sampling it? Are these replicated observations, or different observations? Or unreplicated sub-observations?

    Answers here will affect what you plot in terms of “error bars” on annual GMT observations and what you can infer from them.

  27. Pierre Gosselin
    Posted Dec 31, 2007 at 5:56 AM | Permalink

    I’d say it means that they (IPCC) want to appear to be much more sure than what the actual data warrant. It’s like the weatherman saying he’s quite sure it won’t rain, when actually there are a lot of dark thundery clouds approaching.
    “We’re quite sure about the temperature, even though the data shows we can’t be.”

  28. Pierre Gosselin
    Posted Dec 31, 2007 at 6:06 AM | Permalink

    Why does the sea level rise, seemingly at the average centennial rate, during the cooling period (ca. 1945-1970)?

  29. Raven
    Posted Dec 31, 2007 at 6:22 AM | Permalink

    Why does the sea level rise, seemingly at the average centennial rate, during the cooling period (ca. 1945-1970)?

    It slows down during cooling periods.

  30. Raven
    Posted Dec 31, 2007 at 6:24 AM | Permalink

    BTW, the sea level graph comes from a good report by the Idsos that critiques Hansen’s latest claims.
    The full report is here: http://www.co2science.org/scripts/CO2ScienceB2C/education/reports/hansen/hansen.jsp

  31. Filippo Turturici
    Posted Dec 31, 2007 at 8:00 AM | Permalink

    I think we are all missing a point: what uncertainty is.
    Uncertainty is not a statistical question, nor just a random error, as the IPCC or Jones seem to suggest: uncertainty is how precise we can be, or how different two measurements have to be before they are no longer compatible (compatibility replaces equality in measurement).
    It has a “solid” meaning, not just a mathematical one that can be corrected: if my thermometer has a 0.2°C error range, I can measure with a 0.2°C uncertainty and no less; all measurements apparently more precise than 0.2°C are simply meaningless, and that’s all. Here uncertainty is a physical limit, not a calculation error.

    So we have (roughly) these sources of uncertainty:
    1- instrumental;
    2- systematic errors;
    3- random errors;
    4- calculation errors.
    Mathematical tools, better programming and faster CPUs can reduce only the last two kinds of errors to very low levels.
    Kind 1 can be corrected only by using more precise instruments (commercial scientific thermometers have a 0.1-0.2°C uncertainty, so there is no way to get lower…).
    Kind 2, only by eliminating or correcting the sources of error (whether human behaviour, environment, station moves, etc.), often not simply by calculating them (in any case they should be analysed and calculated for each individual instrument).

    Once we fix an uncertainty on our measurements, we can analyse them statistically: but a Gauss curve or the like is not at all an uncertainty, just a way to assign probability (and not even, by itself, a way to detect strange behaviour: having just a 0.1% chance does not by itself mean it cannot happen).

    Thus, I think the real point is being missed even though it is simple: uncertainty is often misinterpreted and underestimated by the IPCC et al.

  32. Steve McIntyre
    Posted Dec 31, 2007 at 10:10 AM | Permalink

    I’ve added a long note on the provenance of SYR SPM Figure 1.

  33. Bruce
    Posted Dec 31, 2007 at 10:54 AM | Permalink

    Isn’t the slope from 1910 to 1940 steeper than any other slope in the last 100 years?

    SUV’s?

  34. Mark T.
    Posted Dec 31, 2007 at 10:57 AM | Permalink

    War Jeeps.
    Mark

  35. steven mosher
    Posted Dec 31, 2007 at 11:23 AM | Permalink

    re 33. It’s very close to the present trend. I have the data here somewhere.

  36. Phil.
    Posted Dec 31, 2007 at 11:57 AM | Permalink

    Re#25 & #26

    Have you guys ever actually done random and systematic error calculations for real science experiments yourselves?

    This is your idea of being helpful?

    What would be helpful would be a script that reproduces the figure in question. That would take out all guesswork.

    If you can explain what was done, by all means, go ahead.

    You distribute a thousand thermometers around the globe. Are you censusing the global mean temperature field, or sampling it? Are these replicated observations, or different observations? Or unreplicated sub-observations?

    Answers here will affect what you plot in terms of “error bars” on annual GMT observations and what you can infer from them.

    So you’re good at asking questions, what’s your answer to the one you asked about the mean temp? Out by a factor of 2?

  37. bender
    Posted Dec 31, 2007 at 11:57 AM | Permalink

    Thanks for pulling this together with the graphs, Steve M.

    “opportunistic view of long-term persistence” – remember that phrase.

  38. Bruce
    Posted Dec 31, 2007 at 12:03 PM | Permalink

    Looking at the Sea Level Rise “rate”, I would suggest the modern period shows only the third-fastest rise, behind:

    ~1845-1875
    ~1910-1940


  39. Steve McIntyre
    Posted Dec 31, 2007 at 2:32 PM | Permalink

    IPCC rejected any allowance for long-term persistence in their estimates of trend uncertainty as follows:

    After already looking into this issue it is apparent that the Cohn and Lins method is likely wrong and misrepresents statistical significance by overestimating long term persistence. There is no known paper showing these are improved models.

    Flash forward to Tamino, Schmidt et al. on Schwartz, where Schwartz argues that one of the implications of a short response time is a low climate sensitivity. Tamino et al. argued that long-term persistence was important, that the response time was longer than Schwartz allowed for, and that there was therefore a larger climate sensitivity.

    Seems to me that there is typical sucking and blowing. If Tamino’s right, then the Review Comments by McKitrick and Cohn and Lins stand and the error bars on the trend are much larger than stated by IPCC. I wonder if Tamino’s going to point this out.

  40. bender
    Posted Dec 31, 2007 at 3:17 PM | Permalink

    But, Steve M, help me out here. I thought the oceans had incredible “thermal inertia”. Is this not a source of “long-term persistence”? Is this not *exactly* what lucia/Schwartz have tried recently to address? Is this not yet another case of “opportunistic viewing”?

  41. bender
    Posted Dec 31, 2007 at 3:19 PM | Permalink

    Whoops, heh heh, didn’t read your last sentence.

  42. Posted Jan 1, 2008 at 10:46 AM | Permalink

    Brohan annual smoothed, with upper and lower 95% uncertainty ranges from the combined effects of all the uncertainties, together with Brohan annual values (circles), looks like this:

    Along with Fig. SPM.3, it looks like this:

  43. Posted Jan 1, 2008 at 10:47 AM | Permalink

    (good match; one series is offset upward for clarity)

  44. steven mosher
    Posted Jan 1, 2008 at 11:05 AM | Permalink

    re 43. Also note, UC, that Hadley have made at least two adjustments to the error estimates since the IPCC publication date.

  45. Posted Jan 1, 2008 at 12:59 PM | Permalink

    Arthur Smith

    They could instead have drawn the smoothed curve and put error bars on all the circles, which would have been a somewhat more normal way of displaying it.

    Something like this:

    ?

    Steven,

    Also note, UC, that Hadley have made at least two adjustments to the error estimates since the IPCC publication date

    and still this problem is in the data: non-zero bias uncertainty over the normal period.

  46. Posted Jan 1, 2008 at 3:29 PM | Permalink

    And if you want to know how the smoothing was done in Brohan et al, take 21 binomial coefficients ( F=diag(fliplr(pascal(21)));F=F/sum(F) 🙂 ), pad the series using the end points to skip the fact that we don’t have future data, and then take a running weighted mean using those coefficients. Brohan doesn’t explain this; it is a trial-and-error result. This method is extremely sensitive to the last data point; add some 0.1 C std noise to the last point and this is what you’ll get with different realizations:

    Have you guys ever actually done random and systematic error calculations for non-causal-filtered results without future data?
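A transcription of that recipe in Python (the toy series and noise level are arbitrary; the weights are the same 21 binomial coefficients as in the MATLAB one-liner above):

```python
# 21-point binomial filter with endpoint padding: pad each end with the
# end value, smooth, then check how much the final smoothed value moves
# when 0.1C noise is added to the last data point.
import numpy as np
from math import comb

w = np.array([comb(20, k) for k in range(21)], dtype=float)
w /= w.sum()                                   # normalised binomial weights

def smooth(x):
    padded = np.concatenate([np.full(10, x[0]), x, np.full(10, x[-1])])
    return np.convolve(padded, w, mode="valid")

rng = np.random.default_rng(3)
x = np.cumsum(rng.normal(0, 0.1, 150))         # toy temperature series
ends = [smooth(np.append(x[:-1], x[-1] + rng.normal(0, 0.1)))[-1]
        for _ in range(1000)]
print("1-sigma spread of the final smoothed value:", round(np.std(ends), 4))
```

Because the padding replicates the last observation ten times, more than half the weight of the final smoothed value rides on that single data point.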

  47. bender
    Posted Jan 1, 2008 at 3:58 PM | Permalink

    Yes. And that’s why I think the “error bars” on this curve are as meaningless as the ones on the MBH99 recon. And as meaningless as the ones they put on GCM ensembles. Pseudostatistics. Just ask Wegman.

  48. rafa
    Posted Jan 2, 2008 at 3:43 AM | Permalink

    Dear all, could someone summarize the answers to Michael’s question for dummies like me? Now we know where the figure comes from (Brohan et al 2006). We know that the FAQ figure does not contain any link to the WG1 Report itself (Steve emphasized this). Brohan et al used a particular smoothing method (criticized by UC and bender). I’m sorry for being that stupid, but I still can’t see the “physics” behind the uncertainty interval. What do those points outside the shaded area physically mean? Once everybody agrees that every single line in the IPCC report where statistics appear has to be carefully scrutinized due to poor (deliberate or not) information, I still can’t answer Michael’s question: “What does that uncertainty interval mean?” Imagine the figure is sales revenue per year. What happened in those years where the sales revenue is outside the shaded area? Thank you all. Have a nice 2008.

    Steve: it’s not inconsistent with the WG1 data; it’s just that it’s been separately calculated. This is not the sort of thing that one anticipates in this sort of report. You would never see this in a summary report by a Canadian royal commission. See the CA links in the thread for discussion of the uncertainty estimates – it’s a large topic.

  49. Jean S
    Posted Jan 2, 2008 at 7:34 AM | Permalink

    UC (or others), slightly off-topic: can you figure out how the “end smoothing” was done for this image (from here):

    The actual smoothed curve (black) is probably done with a 10- or 11-point Gaussian filter, but there is no explanation of how the ends were obtained (nor did they bother to mark the ends with a different color, etc.).

  50. Mike B
    Posted Jan 2, 2008 at 11:04 AM | Permalink

    #49 I’d like to help out Jean, but my Finnish skills have weakened over the years. 🙂

    Do you have a link to the data?

  51. Posted Jan 3, 2008 at 4:23 AM | Permalink

    rafa

    Brohan et al used a particular smoothing method (criticized by UC and bender). I’m sorry for being that stupid, but I still can’t see the “physics” behind the uncertainty interval.

    I think Brohan uses model

    y=T+n

    where an observation y (or a combination of many observations) is the true temperature T plus noise n. The noise component n is further divided into bias-like and uncorrelated errors. Then Brohan applies a linear(*) filter F:

    F(y)=F(T)+F(n)

    and recomputes CIs using F(n) (the uncorrelated noise variance drops, the bias-like part remains the same). So far so good; now the reference is F(T) instead of T. The problem is that F near the endpoints is a different function than in the middle. You can put the filter F in matrix form to get matrix-vector multiplications:

    Fy=FT+Fn

    First, assume that the covariance matrix of n is the identity matrix (and n is zero mean). You get the covariance matrix of the smoothed noise F(n) as FF^T. It is easy to see that the diagonal elements of this matrix are smaller in the middle (and that the noise becomes correlated). This effect is not visible in Brohan’s data. The extreme smoothing case is to take

    F=X(X^TX)^{-1}X^T

    where the first column of X is all ones and the second is time (or vice versa). This is, of course, a least-squares line fit. You’ll observe that FF^T=F, because F is a symmetric projection matrix. And again, the diagonal elements near the ends are larger than in the middle. But this is trivial for those who have plotted confidence limits for fitted regression lines.

    A completely different approach would be to take a statistical model for T and then apply statistical smoothing methods. But even in this case the endpoint uncertainty would be larger; see
    http://www.climateaudit.org/?p=1681#comment-114062

    What physically mean those points outside the shaded area?

    Circles are annual data; they are not related to the shaded area (the uncertainty of the smoothed data). It’s an apples-to-oranges comparison; that’s why I said this figure is confusing.

    (*) it is linear even with the padding procedure
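The FF^T argument is easy to check numerically; here with a plain 5-point moving average, truncated at the ends, as the smoother (a stand-in for whatever filter Brohan actually used):

```python
# Build a smoother matrix F and inspect diag(F F^T): for unit-variance
# white noise this is the variance of the smoothed value at each
# position. It is smallest mid-series and grows toward the endpoints.
import numpy as np

n, h = 30, 2                       # series length; half-width of the window
F = np.zeros((n, n))
for i in range(n):
    lo, hi = max(0, i - h), min(n, i + h + 1)
    F[i, lo:hi] = 1.0 / (hi - lo)  # equal weights, truncated near the ends
var = np.diag(F @ F.T)
print("variance factor at the end:", round(var[0], 3))       # 0.333
print("variance factor mid-series:", round(var[n // 2], 3))  # 0.2
```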

  52. RomanM
    Posted Jan 3, 2008 at 8:28 AM | Permalink

    #49 Jean S:

    Without seeing the data itself, it is hard to be particularly certain, but the behaviour of the smoothing at the endpoints looks to be consistent with a moving average method that I have preferred to use in the past. For each specific calculation, the series is padded with the value of the smoothed result itself. Thus, the padding value changes as you get nearer to the endpoint. In actual practice this is equivalent to truncating the weights used in the smoothing. For example, when using relative weights of 1-4-6-4-1, the second-last point uses 1-4-6-4 and the end point would use 1-4-6. This tends to favour the endpoints slightly, but does not usually give extreme results as seen in the MBH papers.
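That rule is easy to verify: padding with the smoothed value itself works out algebraically to dropping the weights that fall off the end and renormalising the rest. A sketch with the 1-4-6-4-1 kernel:

```python
# Truncated-weight endpoint smoothing with the 1-4-6-4-1 kernel:
# interior points use all five weights; the second-last point uses
# 1-4-6-4 and the end point 1-4-6, renormalised.
import numpy as np

w = np.array([1.0, 4.0, 6.0, 4.0, 1.0])

def smooth_truncated(x, h=2):
    out = np.empty(x.size)
    for i in range(x.size):
        lo, hi = max(0, i - h), min(x.size, i + h + 1)
        ww = w[h - (i - lo): h + (hi - i)]   # truncate weights at the ends
        out[i] = (ww * x[lo:hi]).sum() / ww.sum()
    return out

x = np.array([0.1, 0.3, 0.2, 0.5, 0.4, 0.6, 0.7])
print(smooth_truncated(x).round(3))
```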

  53. Steve McIntyre
    Posted Jan 3, 2008 at 10:41 AM | Permalink

    #49. Jean S, do you have a ref for the digital data?

  54. rafa
    Posted Jan 3, 2008 at 10:52 AM | Permalink

    UC, thank you!

  55. Posted Jan 3, 2008 at 11:03 AM | Permalink

    c51

    Apples to oranges comparison, that’s why I said this figure is confusing.

    And if you want to see oranges to apples comparison, see Global surface temperatures over the past two millennia (2003) by Mann and Jones, Fig. 2c:

    http://www.geocities.com/uc_edit/divergence.html

    This seems to be decadal reconstruction, smoothed by 40 year (!) lowpass filter, but using uncertainties from ‘standard reconstructions’. In practice, Mann guesses annual temperature for 20 future years, and then makes a statement (my bold):

    This warmth is, however, dwarfed by late 20th century warmth which is observed to be unprecedented at least as far back as AD 200.

    Unbelievable. And then the story continues with the response to Soon et al 2004, “On smoothing potentially non-stationary climate time series”, where he finally reveals how this smoothing was done.

    Great work, Dr. Mann. I’ve collected a set of pictures for you, check them here

  56. bender
    Posted Jan 3, 2008 at 11:14 AM | Permalink

    #48 asks

    What does that uncertainty interval mean

    What it really means and how the AR4 and SPM authors are trying to use it are probably two different things. They are trying to use it to suggest that the increasing trend is robust, i.e. unlikely to occur by random chance alone. First, they are trying to suggest that the smoothed trend fits the actual annual data, which is why the dots and lines are plotted on the same graph. Second, they are trying to suggest that the trend in the line underlies an equal trend in the annual data (points), and that instrumental error is not so large as to preclude this inference.

    What it really means – I don’t know. Putting the measurement error in with the sampling error makes no sense to me. You want to know how accurate your instruments are, sure. But as far as deducing a trend and attributing it to something like CO2, you would want a robust estimate of the stochastic (internal climatic) variability around that trend. Maybe ocean-caused long-term persistence would be such a large source of internal variability that this trend could occur by random chance alone? I don’t know. To me, it is not clear that the inference they are trying to make is justified by the data. My advice is to ask Wegman.

  57. Posted Jan 4, 2008 at 3:21 PM | Permalink

    53, ref for the digital data

    Some detective work

    Finnish Meteorological Institute Contributions

    Reliable estimation of climatic variations in Finland

    Tuomenvirta, Heikki; Doctoral dissertation

    Maybe this is somewhat related to the data. See Figure 5.4 along with this one:

    Quite close. Table 5.1 says that T4(adj) is based on data from 4 stations (Helsinki, Kuopio, Kajaani and Oulu). Can’t answer the end-point handling problem, though. An interesting one is Fig. 5.2,

    where this data is compared to the Jones and Moberg (2003) Fennoscandia grid average. Waldo is in the Jones data, but not in the other (not implying anything, but there’s always something about the ’30s)…

  58. RomanM
    Posted Jan 5, 2008 at 12:11 PM | Permalink

    Nice sleuthing, UC! Although I am quite familiar with a Baltic language, it is not (anything like) Finnish.
    Unfortunately, it does not seem to answer Jean S’s original question on the endpoint smoothing. In the thesis referenced in #57:

    Reliable estimation of climatic variations in Finland
    ( https://oa.doria.fi/bitstream/handle/10024/2634/reliable.pdf?sequence=1 … I can’t seem to get the link tags working properly here today.)

    on p. 23, the author explains the Gaussian smoothing, referred to as G3 and G10, used in the thesis in great detail. When it comes to the endpoints, however, the author says:

    The first (last) few values in the filtered series are mainly determined by the original data following (preceding) the year in question. The filtered values near the both ends of the time series must therefore be interpreted with some caution. The shape of the curves can change when new values are added.

    This says absolutely nothing about the actual method used in dealing with the endpoint smoothing. What is the Finnish for “Where’s the data?!!”.

  59. Posted Jan 8, 2008 at 1:05 PM | Permalink

    RomanM,

    This says absolutely nothing about the actual method used in dealing with the endpoint smoothing.

    Seems to be a topic that is skipped very often in climate publications. These smoothed present-day ‘comparisons’ to the ’30s and to the past millennium are quite disturbing.

    What is the Finnish for “Where’s the data?!!”.

    The data might be somewhere in here http://www.smhi.se/hfa_coord/nordklim/index.php?page=dataset . Too much work to find out… But I’m sure the author knows English well, if someone wants to ask directly.

  60. Mike Rankin
    Posted Jan 8, 2008 at 3:24 PM | Permalink

    Re: Smoothing

    It seems to me that this question has been examined previously. I don’t know which thread, but it may have been in late 2005 or early 2006. It seems that the “smoothing” for the end points was actually only an extension with the same slope as the last two normally calculated points. I believe this was in the context of some paper by Mann.

  61. John Norris
    Posted May 1, 2009 at 3:35 PM | Permalink

    I found this article in the April 20 Newsweek, ‘In the Great Ship Titanic’, where the new Secretary of the DOE, Steven Chu, references the 2007 IPCC report.

    Zakaria: Skeptics say there’s still conflicting evidence on global warming.
    Chu: I urge everyone to do this: Google the 2007 IPCC report. The 100-year trend is unmistakable. The first thing to emphasize is don’t get excited about one or two years. It’s just like you should not get excited that one very bad hurricane is evidence there’s global warming.

    So I followed his advice. Having read most of the (4) IPCC assessment reports and numerous CA threads such as this one, it was of course very familiar territory. The first and most prevalent 100-year trend I could find in the Summary for Policy Makers was the subject graphic of this thread. So the graphic in this thread, with all its weaknesses, appears to be the new DOE Secretary’s favorite substantiation for AGW. I doubt he is aware of the shortcomings; perhaps he needs a brief.