In May I began a quest to better understand how GISS does its homogeneity adjustment, also known as GISS Step 2. Steve McIntyre took the ball from that scrum and ran with it, producing a set of R tools that nearly replicate the GISS method. Some of the endpoint cases continue to confound those of us trying to understand the source code and how it reconciles – or doesn’t – with the peer-reviewed literature.
As this was going on, Anthony Watts pinged me several times, asking that I look at the Cedarville, CA adjustment to better understand why GISS would apply an urban adjustment to an obviously rural station, a topic which he explored in a previous post. I hesitated, because Cedarville had a lot of “nearby” (as defined by GISS) rural stations, and I wanted something simpler to look at. However, I did not forget his request and I took occasional peeks at the station and its neighbors. Below is an overall site view of the Cedarville station. GISS assigns a “night lights” value of 2 to this station, which is what causes it to go through the homogenization process.
Here is a Google Earth image of Cedarville and the surrounding area with the NASA City Lights image overlay enabled. I am not sure what the NASA sensors are picking up to assign Cedarville the “2” rating.
Anthony says this in his post about Cedarville: “a place with a good record and little in the way of station moves”. Generally this may be true, but I personally am suspicious about the fidelity of a station’s record when I see Batman lurking in the 1930s:
OK, let’s assume for the moment that Cedarville’s record is beyond reproach. Let’s further assume that the Cedarville station is urban, and is cursed with the typical frailties of an urban station: lots of asphalt, little vegetation, and placement near an air-conditioner, strip mall, or jet engine. Certainly the surrounding rural stations are of such pristine fidelity that they can be used to remove the urban noise from Cedarville. Let’s take a closer look at those stations and that homogeneity adjustment.
Below is a Google Earth view of all of the rural stations within 500km of Cedarville that are used to completely determine the station’s urban homogeneity adjustment. The Oregon stations seem to be well-represented:
In the next plot I have color-coded the markers to reflect each station’s trend: white is neutral, reds are warming, blues are cooling (the darker the color, the greater the trend). Unfortunately, the red circle I used to indicate the 500km radius gives the white markers a red cast. Orleans and Electra stick out like sore thumbs with sharp cooling trends, while other stations generally exhibit a flat or warming trend.
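For readers who want to reproduce the color-coding, here is a minimal Python sketch. It assumes the trend is an ordinary least-squares slope over the non-missing years, which is my assumption and not necessarily how the plot above was made. The function names and the `neutral_band` threshold are hypothetical.

```python
import numpy as np

def trend_per_century(years, temps):
    """Least-squares slope of temps against years, in degrees per century.

    Missing values are given as NaN and are skipped.
    """
    years = np.asarray(years, dtype=float)
    temps = np.asarray(temps, dtype=float)
    ok = ~np.isnan(temps)
    slope, _intercept = np.polyfit(years[ok], temps[ok], 1)
    return slope * 100.0

def trend_colour(trend, neutral_band=0.1):
    """Map a trend to a marker colour; neutral_band is an arbitrary cutoff."""
    if abs(trend) <= neutral_band:
        return "white"
    return "red" if trend > 0 else "blue"
```

A darker shade for a larger trend could be added by scaling the color intensity with the absolute value of the slope.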
I then went through the GISS Step 2 process of combining the rural stations into a single “rural record”. This is done by starting with the station with the longest record – Golconda – and combining the remaining stations one by one from longest to shortest record. Without going deep into the details, each station is first adjusted (biased) such that its mean matches the mean of the combined record. Then, the station’s record is averaged in with the combined record using a weight that decreases linearly with distance from the urban station at the center (in this case, of course, the metropolis of Cedarville).
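The combining procedure just described can be sketched roughly in Python. This is my simplified reading, not the GISS source code. In particular I am assuming that the means are matched over the years where both records have data, that the weight is 1 - d/500 with d in km, and that a station with no overlap is skipped. The name `combine_rural` is hypothetical.

```python
import numpy as np

def combine_rural(records, distances_km, radius_km=500.0):
    """Combine rural records into one series.

    records      : list of 1-D arrays on a common year axis, NaN = missing
    distances_km : distance of each station from the urban station
    """
    records = [np.asarray(r, dtype=float) for r in records]
    # Assumed weight: falls linearly from 1 at the centre to 0 at the radius.
    weights = [max(0.0, 1.0 - d / radius_km) for d in distances_km]
    # Longest record (most non-missing years) first.
    order = sorted(range(len(records)),
                   key=lambda i: -np.count_nonzero(~np.isnan(records[i])))
    first = order[0]
    combined = records[first].copy()
    wsum = np.where(np.isnan(combined), 0.0, weights[first])
    for i in order[1:]:
        rec = records[i]
        overlap = ~np.isnan(rec) & ~np.isnan(combined)
        if not overlap.any():
            continue  # assumption: a station with no overlap is skipped
        # Bias the station so its mean matches the combined record's mean.
        shifted = rec + (combined[overlap].mean() - rec[overlap].mean())
        have = ~np.isnan(shifted)
        old = np.where(np.isnan(combined), 0.0, combined)
        total = old * wsum + np.where(have, shifted, 0.0) * weights[i]
        wsum = wsum + np.where(have, weights[i], 0.0)
        # Weighted average; years with no data at all stay NaN.
        combined = np.divide(total, wsum,
                             out=np.full_like(total, np.nan),
                             where=wsum > 0)
    return combined
```

Because each station is biased against the running combined record, the order of combination matters, which is why the first few long records can shape the final result so strongly.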
The next plot shows the difference between the Golconda record and the final combined rural record. While Golconda has an influence on the final record, it does not appear overwhelming.
I did notice a big difference when the fourth station, Orleans, was combined (Mina is the second and Willows the third). Comparing the difference between Orleans and the final combined rural record, I saw the slope go slightly negative but close to zero, and the extrema pull in much closer to the final value. I am not sure how to determine which station of the 29 has the greatest influence on the combined record, but my instinct is telling me Orleans is the one.
Here is a comparison of the combined Golconda, Mina, Willows, and Orleans record with that of all 29 stations combined. Clearly the first four rural stations of the 29 get us very close to the final solution:
The next part of GISS step 2 takes the difference between the Cedarville record and that of the combined rural stations. Following is a plot comparing those two records:
The difference between Cedarville and the combined rurals is shown in the next plot. Also shown is the adjustment value that GISS calculates from the difference. I would have expected an adjustment that looked less like a lower envelope value and more like an average value. The adjustment result indicates all values before 1910 should be adjusted upward and all values after 1910 should be adjusted downward.
The adjustment shown above in red is then added back into the Cedarville record to produce the homogenized result. The next plot compares the homogenized version of Cedarville with the original. It clearly shows that values before 1910 are adjusted up, and values since are adjusted down.
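The bookkeeping in these last two steps is simple enough to sketch in Python. I am deliberately not reproducing how GISS turns the difference into an adjustment, so the adjustment is treated here as a given input. The function names are hypothetical.

```python
import numpy as np

def difference_series(station, rural):
    """Station record minus combined rural record, year by year."""
    return np.asarray(station, dtype=float) - np.asarray(rural, dtype=float)

def homogenize(station, adjustment):
    """Add the adjustment back into the station record.

    A positive adjustment raises the value for that year and a negative
    one lowers it. NaN (missing) years stay NaN.
    """
    return np.asarray(station, dtype=float) + np.asarray(adjustment, dtype=float)
```

With an adjustment that is positive before 1910 and negative after, `homogenize` raises the early values and lowers the later ones, which is the pattern seen in the plot.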
So what do I make of all this? In the simplest terms, I see Orleans having a rather large influence on the adjustment made to Cedarville. Should Cedarville be adjusted? Well, it certainly is not urban, so the standard GISS urban adjustment seems inappropriate. But the fact that Batman lurks in the 1930s indicates to me that something is amiss and needs to be (as Delbert Grady says in The Shining) “corrected”.
Is Orleans an appropriate adjuster? Certainly it is rural, and the station history indicates it has not moved. However, when I look at the plot of the Orleans data, I see something happened around 1929, and my guess is that it was not sudden global cooling:
I don’t think this is necessarily a situation where garbage in equals garbage out. Rather, I think it is a situation in which a bunch of trimmings are thrown together and mixed to produce a kind of adjustment sausage. It is not necessarily something that accurately reflects the initial ingredients (inputs), but the output sure is tasty, especially after it has been cooked for a while.