HadCRU Sept 1850

For some reason, the main HadCRU global data set http://hadobs.metoffice.com/crutem3/diagnostics/global/nh+sh/monthly is missing September 1850. I noticed this when the plot came up one month short. I wonder why this month went AWOL.

21 Comments

  1. Posted Jul 22, 2009 at 11:23 AM | Permalink

    That is curious. It looks like two years ago there were some data from September 1850 😉 See http://www.unur.com/climate/crutem3-1850-2007.html

  2. Steve McIntyre
    Posted Jul 22, 2009 at 11:37 AM | Permalink

    I’m sure that it exists.

    It’s not a big deal, but it annoyed me when I tried to make a time series and came up one short. I had to add a date-checking step in my read script.

    From an operating point of view, it’s odd that data would go AWOL.
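A date-checking step of the kind mentioned above might look like the following minimal Python sketch. It assumes each data row begins with a `YYYY/MM` label, as in the row quoted elsewhere in this thread. The function name `find_missing_months` is hypothetical, and this is not the read script actually used.

```python
from typing import Iterable, List


def find_missing_months(labels: Iterable[str]) -> List[str]:
    """Return the YYYY/MM labels absent between the first and last label seen."""
    seen = set()
    for label in labels:
        # Assumed label format: "YYYY/MM", e.g. "1850/09".
        year, month = label.strip().split("/")
        seen.add((int(year), int(month)))
    if not seen:
        return []

    (y, m), last = min(seen), max(seen)
    missing = []
    while (y, m) <= last:
        if (y, m) not in seen:
            missing.append(f"{y:04d}/{m:02d}")
        # Step to the next calendar month, rolling over at December.
        y, m = (y, m + 1) if m < 12 else (y + 1, 1)
    return missing
```

Run on the first column of a monthly file, this returns a list of the months that are absent, which is empty when the series is complete.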

    • Geoff Sherrington
      Posted Jul 22, 2009 at 6:44 PM | Permalink

      Re: Steve McIntyre (#2),

      This missing year syndrome exists at station level in quite a few datasets. I now run a consecutive number test routinely. Found in GHCN and sometimes in KNMI, often around the 1880-90 period.

      • Andrew
        Posted Jul 22, 2009 at 7:11 PM | Permalink

        Re: Geoff Sherrington (#7), Geoff, are you sure you set KNMI to require only 1% coverage? Higher settings can make it throw out years with insufficient data.

      • Steve McIntyre
        Posted Jul 22, 2009 at 9:50 PM | Permalink

        Re: Geoff Sherrington (#7), On station data, I include the check in my get routine. In such cases, I presume that it’s actual missing data. Does anyone have any idea on what would have caused this error?

        • Mike Lorrey
          Posted Jul 22, 2009 at 10:13 PM | Permalink

          Re: Steve McIntyre (#10), I have long since ceased ascribing to incompetence what can only be described as a willful pattern of malfeasance.

        • Geoff Sherrington
          Posted Jul 24, 2009 at 6:44 AM | Permalink

          Re: Steve McIntyre (#10),

          There are cases from Aust where the BOM seems to have done a late discard of data but some compilers have left some in, with imperfections unstated. So it’s not in recent BOM files but it’s in earlier, but quite recent, compilations by others. It’s easy enough to pick up a year of missing data when shown as monthly or daily, but some annual sets look to have continuous years until you find that there is a “camouflaged” gap of a year in the middle of the data here and there. Example –

          Cape Otway, aka Otway, has:

          BOM 1993 vintage starting 1880
          BOM 2009 web version starting 1865 but missing 1867 and a lot of 1868 and 1869
          GISS homogenised starts 1901
          KNMI GHCN ver2 adjusted WMO 94821 starts 1865 but has values for 1867-9, where the recent BOM version does not always have them.

          The answer as I see it is that compilers make subjective decisions on how much to in-fill and do so without much or any metadata comment. Risky.

  3. jeez
    Posted Jul 22, 2009 at 1:52 PM | Permalink

    Well they certainly make it clear that they don’t want any history recorded by web spiders such as those at archive.org.

    From their robots.txt file
    User-agent: *
    Disallow: /hadcet/data/
    Disallow: /crutem3/data/
    Disallow: /emslp/data/
    Disallow: /en3/data/
    Disallow: /gisst/data/
    Disallow: /gmslp/data/
    Disallow: /hadcet/data/
    Disallow: /hadcrut3/data/
    Disallow: /hadex/data/
    Disallow: /hadgem_sst/data/
    Disallow: /hadghcnd/private/
    Disallow: /hadgoa/data/
    Disallow: /hadisst/data/
    Disallow: /hadrt/data/
    Disallow: /hadsst2/data/
    Disallow: /hadukp/data/
    Disallow: /mohmat/data/
    Disallow: /mohsst/data/
    Disallow: /template/data/
    Disallow: /hadat/uncertainty/
    Disallow: /hadat/hadat2/
    Disallow: /hadat/audit/
    Disallow: /hadat/msu/
    Disallow: /haddtr/data/
    Disallow: /hadcruh/data/
    Disallow: /urban/data/
    Disallow: /hadslp2/data/
    Disallow: /quarc/private
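To see what rules like these mean for a given URL, here is a minimal sketch using Python's standard `urllib.robotparser`. It loads only three of the lines listed above rather than the live file, and the crawler name `ExampleBot` and the file name `example.txt` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the rules quoted above (not the full file).
RULES = """\
User-agent: *
Disallow: /crutem3/data/
Disallow: /hadcrut3/data/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

BASE = "http://hadobs.metoffice.com"


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Report whether the excerpted rules let `agent` fetch BASE + path."""
    return parser.can_fetch(agent, BASE + path)


# A hypothetical file under a disallowed directory.
blocked_example = allowed("/crutem3/data/example.txt")
# The diagnostics page discussed in the post; no excerpted rule matches it.
diagnostics_example = allowed("/crutem3/diagnostics/global/nh+sh/monthly")
```

Matching is by path prefix, so a `Disallow` line applies to everything beneath that directory and to nothing outside it.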

  4. dearieme
    Posted Jul 22, 2009 at 4:04 PM | Permalink

    Potato blight.

  5. Andrew
    Posted Jul 22, 2009 at 6:19 PM | Permalink

    I seem to recall it existing before... what could have happened to it?

    These danged files constantly changing makes doing detailed analysis very difficult.

  6. Geoff Sherrington
    Posted Jul 22, 2009 at 6:35 PM | Permalink

    In September 1850 California became the 31st State of the Union and the world has not been normal since.

  7. Basil
    Posted Jul 22, 2009 at 8:30 PM | Permalink

    Is this it?

    1850/09 -0.513 -0.427 -0.599 -0.211 -0.815 -0.463 -0.557 -0.199 -0.827 -0.195 -0.830

    That’s what I’m seeing there now.

  8. tty
    Posted Jul 23, 2009 at 12:58 AM | Permalink

    Re #3

    Can anyone think of a legitimate reason for blocking archiving of scientific data?

    • jeez
      Posted Jul 23, 2009 at 3:53 AM | Permalink

      Re: tty (#12),

      There may be alternate explanations, such as not wanting web search engines to characterize the site by certain directories, but in this case, despite Steve’s dislike of ascribing motives, I think a lot of site owners nowadays want control of archiving and not have external groups such as archive.org engage in it for them. Whether this is legitimate or not in cases of scientific data is likely a subject of debate. A semi-legitimate example: would you want a corrupted data set that is quickly identified and corrected archived for posterity? Maybe not.

      • TAG
        Posted Jul 23, 2009 at 5:14 AM | Permalink

        Re: jeez (#13),

        Are the data versioned with change orders describing the reasons for each new version?

        • jeez
          Posted Jul 23, 2009 at 2:12 PM | Permalink

          Re: TAG (#14),

          Of course, and all this information can be found online, in the same place as the flowery meadows and rainbow skies, and rivers made of chocolate, where the children danced and laughed and played with gumdrop smiles.

  9. Posted Jul 23, 2009 at 3:27 PM | Permalink

    RE Basil, #9,

    “Is this it?

    1850/09 -0.513 -0.427 -0.599 -0.211 -0.815 -0.463 -0.557 -0.199 -0.827 -0.195 -0.830

    That’s what I’m seeing there now.”

    I still just see a blank line for Sept 1850 at http://hadobs.metoffice.com/crutem3/diagnostics/global/nh+sh/monthly, even after I refresh my screen. Perhaps you’re seeing a cache of an older version?

    • Basil
      Posted Jul 23, 2009 at 4:03 PM | Permalink

      Re: Hu McCulloch (#16),

      Hu,

      I don’t know, maybe it is a cached version, but it would be a relatively recent cache because it goes through June, and they just updated to include the June results. Actually, it is all mystifying, because at the time I saw Steve’s post, I had the page through May loaded into an IE browser tab, where I’ve been refreshing it periodically waiting for the June update. So I went in, saved the May data to a file, refreshed, got it through June, and the line for 1850/09 was there.

      But now, like you say, I’m at a different computer, in a different location, and the line is missing. So I don’t know what is going on. When I get home, on the computer where I have a copy of the June numbers with that line in it, I’ll refresh and see what happens.

      Probably unrelated, I’ve noted lately that that page will not load directly into Firefox, my browser of choice, whereas it used to. Now, Firefox wants to “download” it. I don’t know if they changed something in the header structure of the file, or Firefox has changed how it handles something. It is a little irritating, though, for it not to load directly into Firefox, and have to use IE instead.

  10. Posted Jul 23, 2009 at 9:07 PM | Permalink

    RE #17,
    I can’t see 9/1850 with either IE or Firefox on my home computer, so it doesn’t seem to be a browser issue.

    • Basil
      Posted Jul 23, 2009 at 10:05 PM | Permalink

      Re: Hu McCulloch (#18), And I can see it with both IE and Firefox on my home computer, whereas I couldn’t at school today.

      Weird.

  11. Posted Jul 24, 2009 at 3:35 AM | Permalink

    Basil, I think your numbers are hadcrut3, i.e. land+ocean, from

    http://hadobs.metoffice.com/hadcrut3/diagnostics/global/nh+sh/monthly

    Sept 1850 is indeed missing from the crutem3 (land only) data at

    http://hadobs.metoffice.com/crutem3/diagnostics/global/nh+sh/monthly

    Of course it does not help that both files are named ‘monthly’!

    But the complete data is available (with more sensible filenames) at
    http://www.cru.uea.ac.uk/cru/data/temperature/crutem3gl.txt
    http://www.cru.uea.ac.uk/cru/data/temperature/hadcrut3gl.txt
    The missing number, if anyone cares, is -0.248.


    There is another interesting feature of crutem3gl.txt that I had not noticed before. It tells us that coverage has fallen from a peak of 37% in the 1960s to 25% today.
    This suggests that the number of stations used is falling, like GISS.
    It is also a bit fishy, because only 30% of the Earth’s surface is land, so how can crutem cover 37%? Or does it mean only 37% of the land?
    Also fishy is the fact that this decrease in coverage does not seem to show up in hadcrut3 which claims a fairly steady coverage of 80%.