A Collation Utility for GISS dset1 and dset2

GISS has been providing a considerable amount of intermediate information on their results. Unfortunately, it’s been provided in a binary format that is presumably suited to people who speak Fortran with a Unix accent. I presume that such people converse with one another in medieval Latin. It’s not very handy for people who use modern languages.

Nicholas sent me some binary scripts a while ago, which I’ve modified to make organized dset1 and dset2 data sets. The form of organization is: the object dset1 is a list of 7364 items, one for each row of the GISS station data table, with each item “named” by its 11-digit id number. For dset1 and dset2, there can be 0, 1, 2 or more versions for each station. I’ve made a list of the time series, so that there are 0, 1, 2, ... time series for each of the 7364 items. I’ve checked that these reconcile to the webpage results. Each file is about 7-8 MB, about half the combined size of the 6(!) binary files that GISS uses to store the data.
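
To make the layout concrete, here is a toy sketch of the structure described above. It assumes each series is stored as a monthly R ts object, which the post does not actually state; the ids and values are invented and the name toy is hypothetical.

```r
# Toy stand-in for dset1: a named list with one item per station.
# The ids and values below are invented for illustration only.
toy <- list(
  "22224266000" = list(
    ts(c(-45.1, -40.3, -28.7), start = c(1950, 1), frequency = 12),
    ts(c(-44.8, -41.0, -29.2), start = c(1950, 1), frequency = 12)
  ),
  "40371826000" = list(
    ts(c(-20.5, -18.2, -10.1), start = c(1960, 1), frequency = 12)
  ),
  "99999999000" = list()   # a station with no versions in this data set
)

# Number of versions held for each station
n_versions <- sapply(toy, length)
print(n_versions)

# Stations are addressed by id (as a character string), then by version number
first_version <- toy[["22224266000"]][[1]]
print(start(first_version))
```

Indexing by the id as a character string matters: a bare number inside [[ ]] would be read as a position in the list, not as a name.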

An annoying feature of Hansen’s intermediate files is that the ID numbers are related to, but different from, the ID numbers in the GISS information file. I’ve done a concordance and checked the concordance, but really you’d think that they could manage to keep one set of ID numbers. Even if you prefer another language to R (which you shouldn’t 🙂 ), it’s probably easier to port this sort of organized information out of R into another language. This program supersedes my previous scraping program. The program requires giss.info.tab for names.

source("http://data.climateaudit.org/scripts/station/giss/collate.dset.nicholas.txt")
download.file("http://data.climateaudit.org/data/giss/giss.info.tab","temp.dat",mode="wb");load("temp.dat")
dset1=collate.giss(m=1,month=10,year=2008)
#save(dset1,file="d:/climate/data/station/giss/200810/dset1.tab")
#dset2=collate.giss(m=2,month=10,year=2008)
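
For anyone who wants the collated data outside R, one possible route is to flatten the list into a long table and write it as CSV. This is only a sketch: flatten_dset is a hypothetical helper, not part of the collation script, and the toy list stands in for dset1 with made-up ids and values. It assumes each stored series is a ts object (or at least a numeric vector).

```r
# Flatten a dset-style list into one long table: id, version, time, value.
# 'toy' is a made-up stand-in; substitute dset1 once it has been collated.
toy <- list(
  "11111111111" = list(ts(c(1.5, 2.5, NA), start = c(2000, 1), frequency = 12),
                       ts(c(1.0, 2.0), start = c(2000, 1), frequency = 12)),
  "22222222222" = list()   # station with no versions contributes no rows
)

flatten_dset <- function(dset) {
  rows <- list()
  for (id in names(dset)) {
    versions <- dset[[id]]
    for (k in seq_along(versions)) {
      x <- versions[[k]]
      rows[[length(rows) + 1]] <- data.frame(
        id = id, version = k,
        time = as.numeric(time(x)), value = as.numeric(x),
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, rows)
}

long <- flatten_dset(toy)
out_file <- tempfile(fileext = ".csv")   # temporary path, for illustration
write.csv(long, out_file, row.names = FALSE)
```

When reading the CSV in another language, it is probably safest to treat the id column as text so that the 11-digit ids are not mangled by numeric conversion.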

I use the giss.info.tab table to keep track of stations. To plot (or recover) Verhojansk data from dset2 (uncomment the dset2 line above to build it first), here’s a simple command in which the letters “VERHO” are looked up, the id recovered and the series located.

plot.ts(dset2[[paste( giss.info$id[grep("VERHO",giss.info$site)])]][[1]])
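
The one-liner above packs several steps together. Unpacked, it might look like the sketch below, which runs on made-up miniature versions of giss.info and dset2 (invented ids and values). It assumes the id column is numeric, which the post does not state, and it adds a guard for the case where the pattern matches no station or more than one.

```r
# Made-up miniature versions of giss.info and dset2, for illustration only
giss.info <- data.frame(
  id   = c(11111111111, 22222222222),
  site = c("VERHOJANSK", "SOMEWHERE ELSE"),
  stringsAsFactors = FALSE
)
dset2 <- list(
  "11111111111" = list(ts(c(-46.2, -42.9, -30.4), start = c(1950, 1), frequency = 12)),
  "22222222222" = list()
)

# Step 1: find the rows whose site name contains the pattern
hit <- grep("VERHO", giss.info$site)

# Step 2: guard against zero or several matches before indexing
stopifnot(length(hit) == 1)

# Step 3: convert the id to a character string so it is used as a list name,
# not as a numeric position
key <- paste(giss.info$id[hit])

# Step 4: pick the first version of the series and plot it
series <- dset2[[key]][[1]]
plot.ts(series, main = giss.info$site[hit], ylab = "value")
```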

16 Comments

  1. Posted Nov 18, 2008 at 5:30 PM

    Some time ago I spent some time animating GISS data. Unfortunately, life intervened in the meantime, and I have not updated the animations since then. However, the page I put up then contains an explanation of the GISS data file format (as it was at the time):

    http://www.unur.com/climate/giss-1880-2006.html

    — Sinan

  2. Steve McIntyre
    Posted Nov 18, 2008 at 5:36 PM

    Hi, Sinan. I’ve got a clean script for decoding the gridded data into R as well (thanks to the ingenious Nicholas). I’ll post it up. Making people sort out big-endian and little-endian is pretty bizarre in this day and age.

    • Posted Nov 18, 2008 at 5:43 PM

      Re: Steve McIntyre (#2), Yup, I am going to download your R code so that I can play around with these data again the moment I get a chance. You have discovered quite a few errors since I put together that animation and I would like to create an animation of differences between the current version of the data set and the version I used back then.

      BTW, congratulations on the recent discoveries. I was very excited today because this time I did not have to tell anyone about them. People were stopping me in the hallway and telling me about the WARMEST!!! October episode.

      I am simply thankful that you and Anthony are doing the hard work so that I have a neverending supply of stories to tell in Stats class 😉

      — Sinan

  3. Keith W.
    Posted Nov 18, 2008 at 5:40 PM

    Hey, Steve, you might want to check the name you use for the Giss Info file. GISS.info provides a link that goes to an art website.

  4. Posted Nov 19, 2008 at 8:06 AM

    newbie here! by the way, what is giss???

  5. Hank
    Posted Nov 19, 2008 at 8:56 AM

    #6, People discussing climate have an annoying habit of throwing around acronyms. GISS stands for Goddard Institute for Space Studies. Or more properly the NASA Goddard Institute for Space Studies. If you want to know what R is: http://www.r-project.org/
    Another one that got me is UHI which simply stands for the phrase urban heat island, and of course AGW stands for anthropomorphic global warming – the dreaded fifth horseman of the apocalypse …….. (in the minds of some).

  6. Vincent Guerrini Jr
    Posted Nov 19, 2008 at 9:03 AM

    Desperate RC
    http://www.realclimate.org/index.php/archives/2008/11/mind-the-gap/#more-611

  7. Hank
    Posted Nov 19, 2008 at 12:20 PM

    #6, Correction: Anthropogenic Global Warming not Anthropomorphic…..

  8. Jryan
    Posted Nov 19, 2008 at 12:51 PM

    Anthropomorphic Global Warming is the Heat Miser from the old Rudolph Christmas special.

    • Earle Williams
      Posted Nov 19, 2008 at 4:39 PM

      Re: Jryan (#12),

      Hehe, thanks for the flashback!

      Extra points if you can sing the song (without googling)

  9. Pierre Gosselin
    Posted Nov 19, 2008 at 1:43 PM

    Here’s something OT, but it seems NASA is into revising ocean data too.
    http://earthobservatory.nasa.gov/Features/OceanCooling/page1.php

    Dr Jeff Masters at Wunderblog writes about the above report:
    “The NEW, CORRECTED data show that no cooling of the oceans occurred in 2004-2006, in agreement with what the climate models were predicting.”

    I’m wondering if GISS data management procedures are being applied for oceans too.

  10. Dave Andrews
    Posted Nov 19, 2008 at 4:51 PM

    OT, but there must be problems in the climate science world. Gavin is having a go at Hadcrut 3

    http://www.realclimate.org/index.php/archives/2008/11/mind-the-gap/#comment-103776