Comments on: Another "High Quality" USHCN station

By: Anonymous

Anonymous — Thu, 03 Jun 2010 14:41:25 +0000

[…] a nice photo of the Bolzano Hydrographic Office station: Just kidding. This one comes from here: Another "High Quality" USHCN station Climate Audit […]

By: Watts Up With That? « Rick’s Weblog

Watts Up With That? « Rick’s Weblog — Tue, 15 Jul 2008 05:19:57 +0000

[…] Detroit Lakes MN last week, surveyed by volunteer Don Kostuch, and cross posted it to the website http://www.climateaudit.org/?p=1828#comments that had two air conditioner units right next to it. It looked like an obvious cause and effect […]

By: Eli Rabett

Eli Rabett — Thu, 02 Aug 2007 04:00:40 +0000

You might go read the station history. There is a problem, but it is almost certainly not the air conditioner or the building.

By: Sam Urbinto

Sam Urbinto — Mon, 30 Jul 2007 19:58:03 +0000

The question is: Are the individual measurements accurate enough to result in daily min/max numbers accurate enough to combine to get a true .01 or better resolution for the mean on a monthly basis? Is the monthly reading for October 1990 66.11 or 66.22? Or is the monthly for Oct 1990 in fact 66 +/- 1, or 66 +/- .1, or even 66 +/- .01? Or can we get it to 66 +/- .001 or better?

I doubt it, but even if we are getting an accurate 66 +/- .000001, all we’re getting is how the material acts and how it mixes with air 5 feet up.

Plus, I’d think it’s more like 66 +/- .1 anyway. And even if not, and we consider the temp as accurate and indicative of the location of the thermometer, it’s not necessarily indicative of the area “being measured.”

By: Steven B

Steven B — Sun, 29 Jul 2007 22:27:22 +0000

Dan,

It’s not necessarily limited to the exact same property, but it gets more complicated for other cases. It’s just a matter of maths with random variables. The variance of a sum of random variables is the total of all the numbers in their joint covariance matrix – whether the random variables estimate one quantity or many. If you take the mean by dividing the total by n, the variance of this mean is divided by n^2. All that matters is adding up random variables and dividing by a constant, but in context it wouldn’t make much sense to be averaging estimates of different quantities. (If what you’re talking about is the fact that the actual temperature is different at each physical location, that’s OK. If you average these different values to get an estimate of the average global temperature, the average of the errors in them is still supposed to be zero, and our variance calculates the variance in that average.)

For temperatures at different times you’ll commonly get a more complicated covariance structure. Observations close together in time will be more strongly correlated than those far apart. (To some degree, this applies in the spatial case as well.) Depending on whether this decays to zero and how fast, you can get convergence of the mean, but it will be a lot slower than you expect.
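To make the time-series case concrete, here is a minimal sketch. It assumes, purely for illustration, a correlation that decays as rho**|i - j| between observations i and j; the function names, rho = 0.9 and the single-reading variance of 0.25 are hypothetical choices, not anything measured.

```python
def var_of_mean_from_cov(cov):
    """Variance of the mean: sum of every covariance-matrix entry, divided by n^2."""
    n = len(cov)
    return sum(sum(row) for row in cov) / (n * n)


def decaying_cov(n, var_single, rho):
    """Hypothetical covariance matrix with correlation rho**|i - j| (an assumption)."""
    return [[var_single * rho ** abs(i - j) for j in range(n)] for i in range(n)]


var_single = 0.25  # assumed variance of one reading
rho = 0.9          # assumed correlation between neighbouring readings

for n in (10, 100, 1000):
    correlated = var_of_mean_from_cov(decaying_cov(n, var_single, rho))
    independent = var_single / n  # what perfect independence would give
    print(f"n={n:5d}  correlated={correlated:.6f}  "
          f"independent={independent:.6f}  ratio={correlated / independent:.2f}")
```

With a correlation that decays this fast the mean does still converge, but for large n its variance stays a roughly constant factor above the independent case, so many more readings are needed for the same accuracy. A correlation that does not decay to zero would instead leave a floor that no sample size removes.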

In the other thread I give a link to a reference. This quotes the only bit of this argument that isn’t derived from perfectly standard properties of variances, which is the fact that the variance of a sum of correlated data is the sum of all the elements of the covariance matrix. Every stats book I’ve come across always assumes the special case of perfect independence. When I first wanted the result myself, I had to work it out from first principles; it involves some messy algebra, but is pretty straightforward as maths goes. It seems like it should be a standard result, but it’s rare to even find it simply quoted (as my linked reference does) let alone derived.
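For anyone who wants the algebra, a sketch of the derivation follows. The symbols \mu_i (the mean of y_i) and \sigma^2 (the common variance in the special case) are notation introduced here for illustration; the special case assumes every pair has the same correlation r.

```latex
\begin{align*}
\operatorname{Var}\Big(\sum_{i=1}^{n} y_i\Big)
  &= E\Big[\Big(\sum_{i=1}^{n} (y_i - \mu_i)\Big)^{2}\Big]
   = \sum_{i=1}^{n}\sum_{j=1}^{n} E\big[(y_i - \mu_i)(y_j - \mu_j)\big]
   = \sum_{i=1}^{n}\sum_{j=1}^{n} \operatorname{Cov}(y_i, y_j), \\
\operatorname{Var}(\bar{y})
  &= \frac{1}{n^{2}} \sum_{i=1}^{n}\sum_{j=1}^{n} \operatorname{Cov}(y_i, y_j).
\end{align*}
% Special case: Var(y_i) = \sigma^2 for all i, Cov(y_i, y_j) = r\sigma^2 for i \neq j.
\begin{align*}
\operatorname{Var}(\bar{y})
  &= \frac{1}{n^{2}}\big(n\sigma^{2} + n(n-1)\,r\,\sigma^{2}\big)
   = \Big(r + \frac{1-r}{n}\Big)\sigma^{2}.
\end{align*}
```

The n diagonal entries contribute \sigma^2 each and the n(n-1) off-diagonal entries contribute r\sigma^2 each, which is where the r term that does not shrink with n comes from.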

If you check my Milestone post, you’ll see the formula when the covariances are constant off diagonal is Var(y-bar) = (r + (1-r)/n) Var(y_i), where y is the data being averaged, r the cross-correlation coefficient, and n the sample size. So if r is 0.001, and Var(y_i) is 0.25, then the average of 100 points will have variance (0.001 + 0.999/100)*0.25 = 0.01099*0.25 ~= 0.0027, which gives an accuracy (standard deviation) about ten times better than a single point. But if we increase the sample size to a million points, we get (0.001 + 0.999/1000000)*0.25 = 0.001000999*0.25 ~= 0.00025, which gives an accuracy only about three times better than the 100-point average, even though the naive prediction would be that it should be 100 times better. Increasing the sample size any more will make virtually no difference to the accuracy of the average.
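The arithmetic is easy to check. The sketch below just evaluates Var(y-bar) = (r + (1-r)/n) Var(y_i) for the numbers in this example; the function name `var_of_mean` is made up for illustration.

```python
import math


def var_of_mean(r, n, var_single):
    """Variance of the mean of n equally correlated values: (r + (1 - r)/n) * Var(y_i)."""
    return (r + (1.0 - r) / n) * var_single


var_single = 0.25  # Var(y_i) from the example
r = 0.001          # cross-correlation coefficient from the example

for n in (1, 100, 1_000_000):
    v = var_of_mean(r, n, var_single)
    naive = var_single / n  # prediction under perfect independence (r = 0)
    print(f"n={n:>9}: variance={v:.6g}  sd={math.sqrt(v):.4f}  "
          f"naive sd={math.sqrt(naive):.5f}")

# As n grows the variance approaches r * Var(y_i), so the sd cannot drop below this.
print(f"limiting sd: {math.sqrt(r * var_single):.4f}")
```

The last line is the point of the exercise: however large n gets, the standard deviation of the average cannot fall below sqrt(r * Var(y_i)).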

Obviously, I don’t know how independent temperature measurements actually are. It would take a detailed knowledge of the entire measurement process to even make a guess. If you’ve got a huge correlation, around 0.1, then those hundredth of a degree accuracies are in major trouble. If r is down around 0.001 as in the above example, then it’s probably not an issue in the situation under discussion. But if you think about how temperature anomaly (as opposed to the mean temperature) is calculated, how likely do you think it is that the correlation is so small?

My apologies for the length of the post. I can do a more detailed discussion of any gaps tomorrow if people ask for it, and if Steve M doesn’t object, but I hope that with these pointers it should be possible to leave it as ‘an exercise for the student’. 🙂

By: Derek Kite

Derek Kite — Sun, 29 Jul 2007 21:49:55 +0000

They are air conditioning condensing units. RUUD brand name. The closest one is quite recent, probably 13 SEER, so maybe 2 years old.

If it is a radio station, they probably run all year round.

Derek

By: Dan Hughes

Dan Hughes — Sun, 29 Jul 2007 21:05:07 +0000

re: #36 Thanks for the heads up Steven. Could that be limited to 100 estimates of the exact same property? Does it apply to measurement of the temperature at the same location but at different times? Do you have a handy reference for us? I have John Mandel, The Statistical Analysis of Experimental Data, John R. Taylor, An Introduction to Error Analysis and F. B. Hildebrand, Introduction to Numerical Analysis. An example calculation and analysis would also be very helpful.

By: L Nettles

L Nettles — Sun, 29 Jul 2007 19:25:34 +0000

On the issue of cable length for the MMTS, I will note that 3 of the sites I have inspected appear to have cable lengths longer than the standard 15 feet or so: Newberry, SC; Cheraw, SC; and Sumter, SC. Only Cheraw is currently posted.

By: Magnus Andersson

Magnus Andersson — Sun, 29 Jul 2007 18:57:36 +0000

Just an OT comment. I hope they’ve got good fans around the thermometers outside the city of Sundsvall. About plus 5 degrees Celsius and just a bit snowy today. An image and an article:

http://www.aftonbladet.se/vss/nyheter/story/0,2789,1129554,00.html

The AGW alarmism goes on in Sweden anyway. Our prime minister (who I voted for) has dropped all other questions, he says… 😛 (pure insanity)

By: Steven B

Steven B — Sun, 29 Jul 2007 17:01:00 +0000

Re 35, and others,

Multiple estimates can give you more significant figures, because you’re getting more data. 100 data points with 3 significant figures each is actually 300 significant figures of data. Nearly all of that is redundant – the same bit of information repeated – but if the errors are even partly independent, then it’s equivalent to more than 3 digits of information in total.

(The business about systematic biases is a separate matter, and you’re all quite right about that. The error in the average converges (if at all) on the average of the systematic biases. There are other problems too.)

As I point out in another thread (Milestone), there’s a limit beyond which you cannot go. Collecting more data eventually becomes entirely redundant, and you cannot simply go on taking bigger sample sizes to get indefinite improvement in accuracy. But it isn’t correct to say you can’t get any improvement by averaging, either. You can get a bit, but then it stops.

The people running these networks don’t seem to have thought about it either, and until they do the calculation they can’t be sure they can really get 0.01C accuracy or whatever. They should be rightly criticised for that. But that doesn’t mean you can’t get any improvement – and if you keep on insisting it does, they’ll dismiss the entire argument with contempt.