One of the Team’s more adventurous assumptions in creating temperature histories is that there was an abrupt and universal change in SST measurement methods from buckets to engine inlets in 1941, coinciding with the U.S. entry into World War II. As a result, Folland et al introduced an abrupt adjustment of 0.3 deg C to all SST measurements prior to 1941 (with the amount of the adjustment attenuated in the 19th century because of a hypothesized use of wooden rather than canvas buckets). At the time, James Hansen characterized these various adjustments as “ad hoc” and of “dubious validity”, although his caveats seem to have been forgotten and the Folland adjustments have pretty much swept the field. To my knowledge, no climate scientist actually bothered trying to determine whether there was documentary evidence of this abrupt change in measurement methods. The assumption was simply asserted enough times that it came into general use.
This hypothesis has seemed ludicrous to me ever since I became aware of it. As a result, I was very interested in the empirical study of the distribution of measurement methods illustrated in my post yesterday, which showed that about 90% of the SST measurements in 1970 for which the measurement method was known were still taken by buckets, despite the Team’s assumption that all measurements after 1941 were taken by engine inlet.
Barnett (1984) gave strong evidence that historical marine data are heterogeneous. He found a sudden jump around 1941 in the difference between SST and all-hours air temperatures reported largely by the same ships. Folland et al. (1984) explained this as being mainly a result of a sudden but undocumented change in the methods used to collect sea water to make measurements of SST. The methods were thought to have changed from the predominant use of canvas and other uninsulated buckets to the use of engine intakes. Anecdotal evidence from sea captains in the marine section of the Meteorological Office supported this idea. ..
The first quantitative corrections to SST data were tentatively indicated by Folland and Kates (1984), and were closely followed by those of Folland et al. (1984), who applied a constant positive correction of 0.3 deg. C to all data before April 1940, one of 0.25 deg. C to data between April 1940 and December 1941, and no correction thereafter
Folland and Parker justified the abrupt adjustment as follows:
The abrupt change in SST in December 1941 coincides with the entry of the USA into World War II and is likely to have resulted from a realization of the dangers of hauling sea buckets onto deck in wartime conditions when a light would have been needed for both hauling and reading the thermometer at night. The change was made possible by the widespread availability of engine inlet thermometers in 1941 (section 4)
Parker et al 1995, a companion article, describes the situation in similar terms:
Comparison with NMAT suggested that the change in instrumentation took place rather suddenly around the Second World War, so Folland et al. (1984) added 0.3 °C until early 1940, 0.25 °C thereafter through 1941, and nothing subsequently.
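The step adjustment described in these passages can be written out as a simple function. This is only a minimal sketch using the dates and values quoted above for Folland et al. (1984); the function name is hypothetical, and the sketch ignores any attenuation of the adjustment in the 19th century.

```python
from datetime import date


def folland_1984_adjustment(obs_date: date) -> float:
    """Amount (deg C) added to an SST observation under the quoted step scheme.

    Assumed from the quoted text: +0.30 before April 1940, +0.25 from
    April 1940 through December 1941, and nothing thereafter.
    """
    if obs_date < date(1940, 4, 1):
        return 0.30
    if obs_date <= date(1941, 12, 31):
        return 0.25
    return 0.0


if __name__ == "__main__":
    # Print the adjustment on either side of each step.
    for d in (date(1935, 6, 1), date(1940, 4, 1), date(1941, 12, 31), date(1942, 1, 1)):
        print(d.isoformat(), folland_1984_adjustment(d))
```

Written this way, the entire adjustment disappears between one month and the next, which is what an instantaneous and universal switch to engine inlets would imply.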
Hurrell and Trenberth 1999 describe the SST adjustment as follows:
An additional problem with SST is that it is not as well defined as is desirable. Historically, SST has referred to a bulk near-surface ocean temperature measured by tossing a bucket over the side of a ship in order to obtain a water sample. The design and insulation of the buckets has changed with time, however, so that corrections must be applied (Folland and Parker 1995). During World War II, moreover, there was a switch from bucket measurements to measuring the temperature of water taken on to cool the ship’s engines. These temperatures depend on the depth (3 to 7 or more) and size (10 to 51 cm in diameter) of the ship’s intake, the lading of the ship, the configuration of the engine room and the point where the measurement is taken. Such differences are responsible for some of the noise in the SST measurements, but biases also arise because heat from the engine room more than offsets any cold bias from the depth of the intake. Overall, the differences between engine intake and bucket temperatures is typically 0.3°C (see TCH for a more complete review).
The hypothesis of an abrupt and universal change in SST measurements seemed bizarre from the first time that I read these sentences. In my earlier post, I said:
The idea of an abrupt changeover seems a little weird to me. I’ve also seen reminiscences of an oceanographer [Stevenson] talking about taking measurements in a research ship with steel buckets in the 1950s, so I’m not sure how realistic this assumption is. If the changeover were phased in, it would presumably have a material impact on the SST history. It seems like an important enough issue that it shouldn’t be glossed over.
No climate scientist at the time seems to have bothered determining whether this hypothesis of sudden and universal change in measurement techniques could be substantiated in records. Now Kent et al 2007 have carried out a long overdue analysis of the metadata and reported that over 90% of SST measurements in 1970 for which the measurement method was known were still being carried out by bucket, as shown in the following figure. (While half of the measurement methods are unknown, I see no reason to assume that the distribution of measurement methods would differ materially from the very large sample for which measurement methods are known.)
Figure from Kent et al 2007 showing SST measurement method.
The Folland and Parker hypothesis of abrupt and universal change in SST measurement methods in 1941 has been adopted in many data sets. For example, the British GOSTA Atlas 8 (an update of MOHSST6) states that:
The bucket corrections for the SST data are from Folland and Parker, 1995 (see Ref. 1).
The MOHSST5 (Atlas 7) data, in which the Folland and Parker bucket adjustments are already embedded, were used in Kaplan’s “optimal estimation” analysis, whose description says:
Kaplan SST Description: This analysis uses present-day temperature patterns to enhance the meager data available in the past. Reduced Space Optimal Estimation has been applied to the global sea surface temperature (SST) record MOHSST5 (ATLAS7) from the U.K. Meteorological Office to produce 136 years of analyzed global SST anomalies (with regards to normals of 1951-1980), where data gaps are removed and sampling errors are diminished.
Folland was a lead author of IPCC TAR. The coordinating lead authors of the section discussing Folland’s bucket adjustments are Trenberth and Jones. IPCC AR4 cites several references on buckets and several articles by lead author Kent, but does not discuss this important article. AR4 mentions buckets on no fewer than 10 occasions, saying:
A combined physical-empirical method (Folland and Parker, 1995) is mainly used, as reported in the TAR, to estimate adjustments to ship SST data obtained up to 1941 to compensate for heat losses from uninsulated (mainly canvas) or partly insulated (mainly wooden) buckets…
recent studies have estimated all the known errors and biases to develop error bars (Brohan et al., 2006). For example, for SSTs, the transition from taking temperatures from water samples from uninsulated or partially-insulated buckets to engine intakes near or during World War II is adjusted for, even though details are not certain (Rayner et al., 2006).
…Owing to changes in instrumentation, observing environment and procedure, SSTs measured from modern ships and buoys are not consistent with those measured before the early 1940s using canvas or wooden buckets. SST measured by canvas buckets, in particular, generally cooled during the sampling process. Systematic adjustments are necessary (Folland and Parker, 1995; Smith and Reynolds, 2002; Rayner et al., 2006) to make the early data consistent with modern observations that have come from a mixture of buoys, engine inlets, hull sensors and insulated buckets. The adjustments are based on the physics of heat-transfer from the buckets (Folland and Parker, 1995) or on historical variations in the pattern of the annual amplitude of air-sea temperature differences in unadjusted data (Smith and Reynolds, 2002). The adjustments increased between the 1850s and 1940 because the fraction of canvas buckets increased and because ships moved faster, increasing the ventilation.
If 90% of known SST measurements in 1970 were still being made by buckets, then the most reasonable estimate for the entire population is that 90% of all SST measurements were still being made by buckets in 1970. This has a couple of implications. First, the adjustment for engine inlets needs to be phased in after 1970 rather than instantaneously in 1941. Dare one wonder whether some portion of the post-1970 increase in SST can be attributed to the increased proportion of engine inlet measurements evidenced in Kent et al?
Second, the hypothesis of an abrupt change in SST measurement methods was introduced in order to deal with some real phenomenon. If there was no abrupt and universal switch to engine inlet measurements, then whatever the phenomenon was remains unexplained.
It will be pretty easy to do a first-pass sensitivity analysis of an SST series in which the Pearl Harbor adjustment for introduction of engine inlet measurements is phased in after 1970 rather than in 1941, but it’s not too hard to picture the result.
UPDATE: Here is a first-pass analysis of the impact of a more plausible introduction of engine inlet measurements, as discussed in comments below (see especially #14, 19 and 55). Carl Smith commented:
Eyeballing Willis’s graph, and ignoring the red line, it looks to me like the WWII records were dominated by engine-warmed intake data, perhaps because the chaos meant much of the bucket data did not get recorded, and after WWII it was business as usual with mostly bucket data resuming.
Let’s suppose that Carl Smith’s idea is what happened. I did the same calculation assuming that 75% of all measurements from 1942-1945 were made by engine inlets; that the engine inlet share fell back to a business-as-usual 10% in 1946 and remained there until 1970, where we have a measurement point (90% of measurements in 1970 were still being made by buckets, as indicated by Kent et al 2007); and that the 90% bucket share then phased down linearly to 0 by 2000. This results in the following graphic:
Black: HadCRU version as archived; red: with phased implementation of engine inlet adjustment.
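The measurement-method schedule assumed in this scenario can be sketched as follows. This is not the calculation behind the graphic, whose details are not given here. It is a rough illustration under stated assumptions: the function names are hypothetical, all pre-1942 measurements are treated as bucket measurements, the bucket-versus-inlet difference is taken as a flat 0.3 deg C, and the offset relative to the archived series is assumed to scale with the bucket share.

```python
def engine_inlet_fraction(year: int) -> float:
    """Assumed share of SST measurements taken by engine inlet in a given year."""
    if year < 1942:
        return 0.0   # assumption: all buckets before 1942
    if year <= 1945:
        return 0.75  # wartime records dominated by engine inlets
    if year <= 1970:
        return 0.10  # business as usual: 90% buckets through 1970
    if year >= 2000:
        return 1.0   # buckets fully phased out by 2000
    # Bucket share declines linearly from 0.90 in 1970 to 0 in 2000.
    bucket_share = 0.90 * (2000 - year) / 30
    return 1.0 - bucket_share


def phased_offset(year: int, full_adjustment: float = 0.3) -> float:
    """Amount (deg C) to add to a series that assumed all-inlet data after 1941.

    Assumption: bucket readings still need the full adjustment, so the
    offset is the full adjustment times the remaining bucket share.
    """
    if year < 1942:
        return 0.0  # the archived series already carries the bucket adjustment
    return full_adjustment * (1.0 - engine_inlet_fraction(year))


if __name__ == "__main__":
    for y in (1940, 1943, 1960, 1985, 2000):
        print(y, round(engine_inlet_fraction(y), 3), round(phased_offset(y), 3))
```

Under these assumptions the offset is 0.3 × 0.9 = 0.27 deg C throughout 1946-1970 and then shrinks to zero by 2000, so the phased version differs from the archived one mainly in the decades after the war.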
References:
D. E. Parker, C. K. Folland and M. Jackson, 1995, Marine Surface Temperature: Observed Variations and Data Requirements, Climatic Change 31: 559-600.
C. K. Folland and D. E. Parker, 1995, Correction of Instrumental Biases in Historical Sea Surface Temperature Data, Q.J.R. Meteorol. Soc. 121, 319-367.
Kent et al. 2007.