In addition, that is not what is usually meant by a statement such as “5 ± 0.1”, which almost always means a value of 5, with one standard error being 0.1.

Your other comments (about pressure, lack of measurements, and distribution of temperatures during the day) are most relevant.

w.

distribution of those vs. digital. However, in addition there is some distribution of actual vs. reading errors due to calibration problems and due to systematic effects, as people have noted re: the urban heat island effect and other factors. Simply averaging values when there is, e.g., a systematic factor leading to higher readings does not magically get rid of that effect or make the result more likely to be true.

I haven’t looked into this stuff in detail.. but I’m curious how often the pressure/density of the atmosphere, its water vapor content, etc., are also measured at the same points/times within the record so that those factors can be taken into account, since there is a meaningful difference in the weight that should be given to a temperature reading depending on the density of the atmosphere and the heat characteristics of the different substances within the atmosphere at that point.

This is clear if you consider what the notion of a global temperature might conceptually mean, and what its purpose is. I assume that in a sense the global average temperature result is conceptually a proxy for the energy contained within the system as a whole. In reality that energy is distributed unevenly over the surface (or actually throughout the volume of the atmosphere.. but ignoring that for the moment).

Presumably the notion is that if you average the temperatures in equal-size block chunks over the globe you’ll get a meaningful average temperature for the whole. One factor I’m curious whether they’ve taken into account is that this notion really seems to depend on the idea that there is an equal quantity of gas represented within each chunk, so that each block of the same physical size makes an equal-magnitude contribution to the total energy of the system. However, if the pressure and density of the gas vary, then each block does not make an equal contribution to the energy of the system as a whole.

In some sense the notion of using a global temperature average might also be considered a proxy for what the temperature would be if the whole system were well-mixed and at equilibrium, with no fluctuations over the globe… that would be the temperature. The difficulty with this is that it’s assuming temperature readings at each point should be given equal weight when considering the inter-mixed temperature. This ignores the issue of pressure/density varying. E.g., to use an extreme that won’t happen in reality to illustrate the point.. pretend the atmosphere were twice as dense within the area of one temperature reading as another. In that case the reading is a proxy for twice as much gas, and presumably, if everything were intermixed and at the same pressure, that amount of gas would have contributed disproportionately to the resulting average temperature. Also, of course, the whole concept ignores issues related to temperature changes if a gas is placed under more pressure.. or less pressure.. if the atmosphere were mixed to be homogeneous.
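To make the density point concrete, here is a minimal sketch, with made-up numbers, of how a simple average and a mass-weighted average diverge when one reading stands in for twice as much gas as the other:

```python
import numpy as np

# Hypothetical illustration: two reading sites of equal area, but the
# atmosphere at the first is twice as dense, so its reading stands in
# for twice as much gas.
temps = np.array([10.0, 20.0])  # temperature readings (degrees C)
mass = np.array([2.0, 1.0])     # relative air mass each reading represents

simple_mean = temps.mean()                       # treats both sites equally
weighted_mean = np.average(temps, weights=mass)  # mass-weighted

print(simple_mean)    # 15.0
print(weighted_mean)  # ~13.33: the denser column pulls the average down
```

The two averages differ by well over a degree even in this two-site toy case, which is the sense in which equal weighting quietly assumes equal air mass per block.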

There is also the issue that each temperature measurement is a proxy for a large geographic area within which the temperature will vary. E.g., pretend you took 1 fixed temperature reading to stand in as a proxy for the global temperature. Obviously that wouldn’t be very accurate, and I’m sure there are points on the earth where that 1 temperature would show cooling, or warming, regardless of the rest of the world. Presumably localized weather patterns may change how that proxy reading site compares to a true average over every point within that area (e.g., El Niño, etc., may shift values locally). Obviously the hope is that these factors will randomly cancel out.. but I’d be curious what the distribution of potential fluctuations is, and how likely they are to all cancel out conveniently… especially given the small % of the surface (and ocean) over time that has been covered, and the self-selected nature of much of that re: proximity to human civilizations or trade routes, etc., whose distribution may change as weather patterns change.

I’d also be curious how well they take into account variations in the distribution of temperature throughout a day vs. the time the temperature was taken, due to taking temperatures at the same time each day while the angle of the sun at that clock time varies seasonally. Also, if the temperature were continually taken throughout the day there would be a distribution curve of temperatures. Taking, e.g., 2 readings in a day and averaging them isn’t necessarily going to come up with the real average for the day, depending on the distribution of temperatures throughout the day (and as the seasons vary and localized weather changes, the location of those 2 measurements within the daily distribution curve will vary). Again, hopefully randomness will cancel out various factors.. but what level of random fluctuation might potentially exist..
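As an illustration of that point, a small sketch with a made-up, asymmetric daily temperature curve (not real data) comparing the true daily mean with the average of the day's two extreme readings:

```python
import numpy as np

# A made-up, asymmetric daily temperature curve: cool night, sharp
# afternoon peak around 2 pm. Purely illustrative, not real data.
hours = np.linspace(0.0, 24.0, 1441)  # one sample per minute
temp = 10.0 + 8.0 * np.exp(-((hours - 14.0) / 4.0) ** 2)

true_mean = temp.mean()                      # average over the whole day
minmax_mean = (temp.min() + temp.max()) / 2  # "two readings" average

print(round(true_mean, 2))    # ~12.36
print(round(minmax_mean, 2))  # ~14.0: overstates the true daily mean here
```

Because the curve spends most of the day near its nighttime level, the min/max average lands well above the true time average; a symmetric curve would not show the discrepancy.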

I was referring to an error range independent of the error distribution. The actual range of potential values is *not* magically reduced by averaging 2 values: it remains ±0.1. What may change is the distribution, which, as I noted, with certain error distributions, such as a normal distribution, would bunch up towards the middle, as is effectively what you were saying, increasing the probability that the result in the example is closer to 5. However, the point is that if you don’t know the error distribution, all you can say for sure is that the error range is still ±0.1 (and regardless of the distribution of likelihood of the “real value” within that range.. that is still the range.. even if the outlying values are less likely).
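That distinction can be checked numerically. A Monte Carlo sketch, assuming purely for illustration that each error is uniform on ±0.1, shows the worst-case range of the average staying at ±0.1 while the extremes become rarer:

```python
import numpy as np

rng = np.random.default_rng(42)

# Two readings, each a true value of 5 plus an error uniform on [-0.1, 0.1].
# The *support* of the average is still [4.9, 5.1], but its distribution is
# triangular: an extreme average needs both errors at the same extreme.
n = 1_000_000
a = 5 + rng.uniform(-0.1, 0.1, n)
b = 5 + rng.uniform(-0.1, 0.1, n)
avg = (a + b) / 2

print(avg.min(), avg.max())             # close to 4.9 and 5.1: range unchanged
print(np.mean(np.abs(avg - 5) > 0.05))  # ~0.25 vs 0.5 for a single reading
```

So both halves of the argument show up at once: the range is untouched, but the probability mass has bunched toward the center — and that second fact depends entirely on the assumed error distribution.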

The point was also that I’m curious whether, e.g. re: reading errors, people actually know and take into account the true distribution of reading errors. I.e., if there is a psychological basis for people to read higher over time, then that increases the likelihood that the actual value is less than the individual readings, and less than the averages. If people are much more likely to read 5 rather than 4.9 due to unintentional/unaware bias (e.g., even though it’s not likely, pretend that 99% of the time they’ll read 4.9 as 5), then averaging values doesn’t change this, and the “real” value is likely 4.9 rather than 5. By not knowing the true error distribution, your method of using statistics to imply that the average value is somehow more likely to be 5 based on two biased readings is wrong, misleading and an inappropriate way to use statistics.
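The 99% example above is easy to simulate: averaging any number of such biased readings converges to the mean of the biased distribution, not to the true value.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical biased observer: the true value is 4.9, but 99% of the
# time it is (mis)read as 5.0. Averaging many such readings converges
# to the biased mean, not to the true value.
true_value = 4.9
n = 100_000
readings = np.where(rng.random(n) < 0.99, 5.0, 4.9)

print(round(readings.mean(), 3))  # ~4.999, nowhere near 4.9
```

Averaging only shrinks *random* scatter; a systematic offset of 0.99 × 0.1 survives no matter how many readings are pooled.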

I think what you are trying to say is that quantisation errors are correlated, so if the true value is 5.028 you’ll get 5.0 or 5.1 reported depending on the observer’s personal biases, and if the true value changes to 5.032, you still get 5.0 and 5.1 reported with the same frequencies. The convergence of the mean is to the mean of the quantised distribution, and for the accuracy improvement to work, the relative frequency of reporting 5.0 or 5.1 would have to change slightly as the true value went from 5.028 to 5.032. In practice, whether people round up or down is not sensitive to the precise value – for such a value they will almost always round down: very strong correlation.

This is clearly the case for a single quantity measured directly. But in climate measurements there is a whole range of temperatures that occur, and the quantisation error in each case will be different. One day the true value is 5.038, the next it is 7.212, the next 3.819. Now whether the quantisation pushes the average up or down depends on whether those true values close to the half-way points change relative frequencies depending on the precise value; the unambiguous ones always get pushed the same way. Now I could believe 5.04 might get treated noticeably differently from 5.06, rounded down more often than up, but will 5.0499 really give a one-in-ten-thousand change in ratio compared to 5.0501? Or will it be down entirely to observer peculiarities, as to what they habitually do when the reading is indistinguishable from the half-way point?

It doesn’t matter whether the distribution is Gaussian or not – the central limit theorem applies to any distribution with finite variance. But it does very definitely matter that quantisation errors are correlated with the true value, and with each other. You can improve a little on the nominal quantisation by averaging, since some of the ambiguous cases can be influenced by where they are, but below a certain resolution no information can get through the filter of observation, and the accuracy cannot be improved beyond this point.
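A small simulation, using rounding to 0.1 as a stand-in for the observer's quantisation, illustrates both halves of this: a constant true value gives perfectly correlated quantisation errors that no amount of averaging removes, while spread-out true values act like dither and largely decorrelate them.

```python
import numpy as np

rng = np.random.default_rng(2)

def quantise(x):
    """Round to a resolution of 0.1, mimicking an observer's reading."""
    return np.round(x, 1)

# Case 1: the same true value read many times, with no other noise.
# Every reading quantises identically; averaging recovers nothing.
constant = np.full(100_000, 5.028)
print(quantise(constant).mean())  # exactly 5.0, however many readings

# Case 2: varying true values (as with daily temperatures). The spread
# acts like dither: quantisation errors largely cancel, and the mean of
# the quantised readings tracks the true mean closely.
varying = rng.uniform(3.0, 7.0, 100_000)
print(abs(quantise(varying).mean() - varying.mean()))  # small residual
```

This matches the point above: averaging buys accuracy only to the extent that the true values (or added noise) move readings across quantisation boundaries; below that, the correlation sets a floor.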

The error value is a statement about the probable size of an error. In your example, if the value is 5 and the standard error is 0.1, there is about a 95% chance that the true answer is between 4.8 and 5.2. But the largest and smallest numbers are less probable. For example, there is about a 68% chance that the true value is between 4.9 and 5.1.

Now, suppose we average two such values. If the errors are independent, some of the time they will cancel each other. And only occasionally will the two extreme values occur at the same time. Because half of the time the errors are in different directions (one negative, one positive), the average of the two values will have a narrower range of probable errors.

This can be confirmed by the thought experiment of averaging, say, 10,000 such values, each with an error of 0.1. You will *not* get an average of 5 ± 0.1 as you claim. You will come out with an average of 5 plus or minus a very small number, because the errors will average out. There is nothing “magical” about this, as you seem to think. As long as the errors are symmetrically distributed and independent, you **do** gain accuracy by averaging them. In this thought experiment, the answer is 5 ± 0.001. The general formula for the average of N measurements of some value with the same standard error σ is σ/√N; here, 0.1/√10000 = 0.001.
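That thought experiment is easy to reproduce numerically (taking the errors as independent Gaussians for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# 10,000 independent measurements of a true value of 5, each with a
# standard error of 0.1 (modelled here as Gaussian noise). Repeating the
# whole experiment many times shows how tightly the mean clusters.
n, trials = 10_000, 1_000
means = rng.normal(5.0, 0.1, size=(trials, n)).mean(axis=1)

print(round(means.mean(), 3))  # ~5.0
print(means.std())             # ~0.001, i.e. 0.1 / sqrt(10_000)
```

The observed spread of the trial means matches the σ/√N prediction to within sampling noise.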

w.

In fact, assuming the errors are not correlated, the standard error of the mean is not 0.1 as you say in your example.

Uncorrelated errors add “orthogonally”. …and the error of the average will be half of that, or 0.07.
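Filling in the arithmetic the comment elides: for two uncorrelated readings each with error 0.1, adding in quadrature gives

```python
import math

# Uncorrelated errors add in quadrature ("orthogonally"):
e = 0.1
sum_err = math.sqrt(e**2 + e**2)  # error of A + B: ~0.141
avg_err = sum_err / 2             # error of (A + B) / 2

print(round(avg_err, 2))  # 0.07, i.e. 0.1 / sqrt(2)
```

which is where the 0.07 figure comes from: averaging two uncorrelated readings shrinks the standard error by a factor of √2, not by nothing.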

Let’s try a reality check here. Say measurement A is 5 ± 0.1, i.e. from 4.9 to 5.1. Measurement B is 5 ± 0.1, i.e. again from 4.9 to 5.1.

Therefore the average is 5, with the average of the lows being 4.9 and the average of the highs being 5.1; i.e., the uncertainty is still ±0.1.

You don’t magically gain precision by averaging the values. I don’t know what the error distribution curve is like within that range (whether it’s a normal distribution curve or not depends on how much of this is reading uncertainty or calibration uncertainty, etc.), so while, granted, it would likely bunch up towards the center and make 5 more likely to be the result.. that can’t be known without knowing the error distribution.

I’d even wonder, re: reading errors, whether there is any bias due to unconscious expectation among those who expect global warming towards being more likely to perceive the result as higher than it “is”. Similar perceptual expectation biases have been observed in psychological experiments. This could lead to a self-fulfilling prophecy effect. Over time there would be an upward drift, as there would likely be (and has been) an increase in the number of people who expect global warming (or hope to see it to confirm their bias), an increase in how strongly they believe it, and in how inclined they are to be thinking about it as a potential factor that might impact their reading (vs. the times when they know it’s supposed to be a cold day and perhaps physically feel cold and are biased in that direction, or conversely on a hot day).

btw, using measurements to represent such a large geographic area can lead to the risk of more errors than just the localized heat-island effects. E.g., normal local weather patterns that shift, so that the jet stream or water currents shifting slightly over time relative to some fixed measurement points may alter temperature readings over one or more spots without necessarily changing readings elsewhere to compensate. The hope presumably is that such random shifts would cancel each other out.. but I don’t have enough information to assess whether random chance might lead to some fluctuation up or down.. perhaps even for several years..
