BEST stated that one of their distinctive skills was their supposed ability to use short station histories.
This seems to include station histories as short as a single data point. In the BEST station data, there are 130 singletons. An example is Cincinnati Whiteoak, which has one record, as shown below:
# 137532 1 1970.125 12.187 0 28 18
# 137533 1 1966.208 6.575 0 28 18
# 137534 1 1836.042 9.259 0 -99 -99
I wonder how they incorporate such singletons into their program, and why. I also wonder why the record is shown as a singleton at all. I’m 99.9% sure that this is in the data and not an error in my collation, as I triple-checked. (But this is new data for me, so it is not impossible that I’ve made an error somewhere.)
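For anyone who wants to check the singleton count independently, here is a minimal sketch in Python. It assumes a whitespace-delimited data file whose first column is the station ID and whose header lines begin with `%`; both are assumptions about the file layout, not something confirmed above. The function names are hypothetical, and the two rows for station 999001 are made up purely for contrast with the three real rows quoted above.

```python
from collections import Counter

def count_records(lines):
    """Count data records per station ID (first whitespace-delimited field)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and header lines (assumed to start with '%').
        if not line or line.startswith("%"):
            continue
        station_id = int(line.split()[0])
        counts[station_id] += 1
    return counts

def singletons(counts):
    """Return the sorted station IDs that have exactly one record."""
    return sorted(sid for sid, n in counts.items() if n == 1)

sample = [
    "999001 1 1970.042 10.000 0 28 18",   # made-up station with two records
    "999001 1 1970.125 11.000 0 28 18",   # made-up station with two records
    "137532 1 1970.125 12.187 0 28 18",
    "137533 1 1966.208 6.575 0 28 18",
    "137534 1 1836.042 9.259 0 -99 -99",
]

counts = count_records(sample)
print(singletons(counts))  # [137532, 137533, 137534]
```

To run it on a full file, pass the open file handle to `count_records` instead of the sample list. If the count of singletons differs from 130, the discrepancy would point to the collation rather than the underlying data.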
Update: Dmitri observes that the paper says: “A further 0.2% of data was eliminated because after cutting and filtering the resulting record was either too short to process (minimum length ≥6 months)…”. Bill verified the singletons. So they kick out the singletons, but the main puzzle remains: are the singletons real in the underlying data? Or are they an artifact of the collation?
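The length rule quoted from the paper can be sketched as a simple filter. This is only an illustration, not BEST's actual code: it reads "minimum length ≥6 months" as "at least six distinct monthly values per station", which is an assumption, and it ignores the cutting step entirely. The function name, the tuple layout and station 999001 are hypothetical.

```python
from collections import defaultdict

# Threshold quoted from the paper; treating it as a count of
# distinct monthly values per station is an assumption.
MIN_MONTHS = 6

def drop_short_records(rows, min_months=MIN_MONTHS):
    """Split (station_id, date, temp) rows into (kept, dropped) by record length."""
    dates = defaultdict(set)
    for station_id, date, _temp in rows:
        dates[station_id].add(date)
    kept = [r for r in rows if len(dates[r[0]]) >= min_months]
    dropped = [r for r in rows if len(dates[r[0]]) < min_months]
    return kept, dropped

# The singleton from the post, plus a made-up station with six monthly values
# (dates are mid-month decimal years, matching the format of the real rows).
rows = [(137533, 1966.208, 6.575)]
rows += [(999001, round(2000 + (m - 0.5) / 12, 3), 10.0 + m) for m in range(1, 7)]

kept, dropped = drop_short_records(rows)
print(len(kept), len(dropped))  # 6 1
```

Under this reading, every singleton falls below the threshold and is dropped before the analysis, which is consistent with Dmitri's observation.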
4 Comments
I see that record (#137533) in my data and can confirm it’s a singleton.
137532 1 1970.125 12.187 0 28 18
137533 1 1966.208 6.575 0 28 18
137534 1 1836.042 9.259 0 -99 -99
Interestingly, I count 126 singleton stations in the US. So only 4 more worldwide? Something about one of the US data sources?
Well, according to their main paper, “A further 0.2% of data was eliminated because after cutting and filtering the resulting record was either too short to process (minimum length ≥6 months)…”. I guess it means that singletons were kicked out as well.
The singleton 137533 also has a frequency of 1.
w.
When you process down to daily data you get singletons.