For some reason, the main HadCRU global data set http://hadobs.metoffice.com/crutem3/diagnostics/global/nh+sh/monthly is missing September 1850. I noticed this when the plot came up one month short. I wonder why this month went AWOL.
21 Comments
That is curious. It looks like two years ago there were some data from September 1850. See http://www.unur.com/climate/crutem3-1850-2007.html
I’m sure that it exists.
It’s not a big deal, but it annoyed me when I tried to make a time series and came up one short. I had to add a date-checking step to my read script.
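A date-checking step of this kind can be sketched roughly as follows. This is only an illustration, not the actual read script: the function name `find_missing_months`, the assumption that each data row starts with a `YYYY/MM` stamp followed by at least one value, and the sample values are all made up for the example.

```python
import re

# Assumed row layout: "YYYY/MM" followed by at least one value.
# Blank rows, or rows with a date but no value, do not match.
ROW_RE = re.compile(r"^\s*(\d{4})/(\d{2})\s+\S")

def find_missing_months(lines):
    """Return the (year, month) pairs absent between the first and last dated rows."""
    seen = []
    for line in lines:
        match = ROW_RE.match(line)
        if match:
            seen.append((int(match.group(1)), int(match.group(2))))
    if not seen:
        return []
    present = set(seen)
    first, last = min(seen), max(seen)
    missing = []
    year, month = first
    # Walk every calendar month in the span and record the ones not present.
    while (year, month) <= last:
        if (year, month) not in present:
            missing.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return missing

# Hypothetical rows (values invented), with a blank line where 1850/09 should be.
sample = [
    "1850/07 -0.1",
    "1850/08 -0.2",
    "",
    "1850/10 -0.3",
]
print(find_missing_months(sample))
```

Run against the lines of a downloaded file, a non-empty result flags the gap before the series is built, so a one-month shortfall does not silently shift every later date.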
From an operating point of view, it’s odd that data would go AWOL.
Re: Steve McIntyre (#2),
This missing year syndrome exists at station level in quite a few datasets. I now run a consecutive number test routinely. Found in GHCN and sometimes in KNMI, often around 1880-90 period.
Re: Geoff Sherrington (#7), Geoff, are you sure you set KNMI to only needing 1% coverage? High numbers can make it throw out years with insufficient data.
Re: Geoff Sherrington (#7), On station data, I include the check in my get routine. In such cases, I presume that it’s actual missing data. Does anyone have any idea on what would have caused this error?
Re: Steve McIntyre (#10), I have long since ceased ascribing to incompetence what can only be described as a willful pattern of malfeasance.
Re: Steve McIntyre (#10),
There are cases from Aust where the BOM seems to have done a late discard of data but some compilers have left some in, with imperfections unstated. So it’s not in recent BOM files but it’s in earlier, but quite recent, compilations by others. It’s easy enough to pick up a year of missing data when shown as monthly or daily, but some annual sets look to have continuous years until you find that there is a “camouflaged” gap of a year in the middle of the data here and there. Example –
Cape Otway, aka Otway, has:
BOM 1993 vintage starting 1880
BOM 2009 web version starting 1865 but missing 1867 and a lot of 1868 and 1869
GISS homogenised starts 1901
KNMI GHCN ver2 adjusted WMO 94821 starts 1865 but has values for 1867-9, where the recent BOM version does not always have them.
The answer as I see it is that compilers make subjective decisions on how much to in-fill and do so without much or any metadata comment. Risky.
Well they certainly make it clear that they don’t want any history recorded by web spiders such as those at archive.org.
From their robots.txt file
User-agent: *
Disallow: /hadcet/data/
Disallow: /crutem3/data/
Disallow: /emslp/data/
Disallow: /en3/data/
Disallow: /gisst/data/
Disallow: /gmslp/data/
Disallow: /hadcet/data/
Disallow: /hadcrut3/data/
Disallow: /hadex/data/
Disallow: /hadgem_sst/data/
Disallow: /hadghcnd/private/
Disallow: /hadgoa/data/
Disallow: /hadisst/data/
Disallow: /hadrt/data/
Disallow: /hadsst2/data/
Disallow: /hadukp/data/
Disallow: /mohmat/data/
Disallow: /mohsst/data/
Disallow: /template/data/
Disallow: /hadat/uncertainty/
Disallow: /hadat/hadat2/
Disallow: /hadat/audit/
Disallow: /hadat/msu/
Disallow: /haddtr/data/
Disallow: /hadcruh/data/
Disallow: /urban/data/
Disallow: /hadslp2/data/
Disallow: /quarc/private
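For anyone who wants to check which paths these rules actually cover, here is a minimal sketch using Python's standard `urllib.robotparser`. It uses only a subset of the rules quoted above, pasted in as text rather than fetched from the site; the helper name `allowed` and the crawler name `ExampleBot` are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules quoted above, pasted in as text (not fetched live).
rules = """\
User-agent: *
Disallow: /hadcet/data/
Disallow: /crutem3/data/
Disallow: /hadcrut3/data/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

def allowed(path, agent="ExampleBot"):
    """Return True if the pasted rules let the given crawler fetch this path."""
    return parser.can_fetch(agent, "http://hadobs.metoffice.com" + path)

print(allowed("/crutem3/data/"))
print(allowed("/crutem3/diagnostics/global/nh+sh/monthly"))
```

Going only by the rules as quoted, the `/data/` directories are blocked while the `/diagnostics/` path of the monthly file is not listed, so a crawler honoring this file would skip the former but not necessarily the latter.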
Potato blight.
I seem to recall it existing before... what could have happened to it?
These danged files constantly changing makes doing detailed analysis very difficult.
In September 1850 California became the 31st State of the Union and the world has not been normal since.
Is this it?
1850/09 -0.513 -0.427 -0.599 -0.211 -0.815 -0.463 -0.557 -0.199 -0.827 -0.195 -0.830
That’s what I’m seeing there now.
Re #3
Can anyone think of a legitimate reason for blocking archiving of scientific data?
Re: tty (#12),
There may be alternative explanations, such as not wanting web search engines to characterize the site by certain directories, but in this case, despite Steve’s dislike of ascribing motives, I think a lot of site owners nowadays want control of archiving and do not want external groups such as archive.org doing it for them. Whether this is legitimate or not in cases of scientific data is likely a subject of debate. A semi-legitimate example: would you want a corrupted data set that is quickly identified and corrected archived for posterity? Maybe not.
Re: jeez (#13),
Are the data versioned with change orders describing the reasons for each new version?
Re: TAG (#14),
Of course, and all this information can be found online, in the same place as the flowery meadows and rainbow skies, and rivers made of chocolate, where the children danced and laughed and played with gumdrop smiles.
RE Basil, #9,
I still just see a blank line for Sept 1850 at http://hadobs.metoffice.com/crutem3/diagnostics/global/nh+sh/monthly, even after I refresh my screen. Perhaps you’re seeing a cache of an older version?
Re: Hu McCulloch (#16),
Hu,
I don’t know, maybe it is a cached version, but it would be a relatively recent cache because it goes through June, and they just updated to include the June results. Actually, it is all mystifying, because at the time I saw Steve’s post, I had the page through May loaded into an IE browser tab, where I’ve been refreshing it periodically waiting for the June update. So I went in, saved the May data to a file, refreshed, got it through June, and the line for 1850/09 was there.
But now, like you say, I’m at a different computer, in a different location, and the line is missing. So I don’t know what is going on. When I get home, on the computer where I have a copy of the June numbers with that line in it, I’ll refresh and see what happens.
Probably unrelated, I’ve noted lately that that page will not load directly into Firefox, my browser of choice, whereas it used to. Now, Firefox wants to “download” it. I don’t know if they changed something in the header structure of the file, or Firefox has changed how it handles something. It is a little irritating, though, for it not to load directly into Firefox, and have to use IE instead.
RE #17,
I can’t see 9/1850 with either IE or Firefox on my home computer, so it doesn’t seem to be a browser issue.
Re: Hu McCulloch (#18), And I can see it with both IE and Firefox on my home computer, whereas I couldn’t at school today.
Weird.
Basil, I think your numbers are hadcrut3, ie land+ocean, from
http://hadobs.metoffice.com/hadcrut3/diagnostics/global/nh+sh/monthly
Sept 1850 is indeed missing from the crutem3 (land only) data at
http://hadobs.metoffice.com/crutem3/diagnostics/global/nh+sh/monthly
Of course it does not help that both files are named ‘monthly’!
But the complete data is available (with more sensible filenames) at
http://www.cru.uea.ac.uk/cru/data/temperature/crutem3gl.txt
http://www.cru.uea.ac.uk/cru/data/temperature/hadcrut3gl.txt
The missing number, if anyone cares, is -0.248.
There is another interesting feature of crutem3gl.txt that I had not noticed before. It tells us that coverage has fallen from a peak of 37% in the 1960s to 25% today.
This suggests that the number of stations used is falling, as with GISS.
It is also a bit fishy, because only 30% of the Earth’s surface is land, so how can crutem cover 37%? Or does it mean only 37% of the land?
Also fishy is the fact that this decrease in coverage does not seem to show up in hadcrut3 which claims a fairly steady coverage of 80%.