GISS has been providing a considerable amount of intermediate information on their results. Unfortunately, it’s been provided in a binary format that is presumably suited for people who speak Fortran with a Unix accent. I presume that such people converse with one another in medieval Latin. It’s not very handy for people who use modern languages.
Nicholas sent me some binary scripts a while ago, which I’ve modified to make organized dset1 and dset2 data sets. They are organized as follows: the object dset1 is a list of 7364 items, one for each row of the GISS station data table, with each item “named” by its 11-digit id number. For dset1 and dset2, there can be 0, 1, 2 or more versions for each station, so each of the 7364 items is itself a list of 0, 1, 2, ... time series, one per version. I’ve checked that these reconcile to the webpage results. Each file is about 7-8 MB, about half the combined size of the 6(!) binary files that GISS uses to store the data.
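For anyone who wants to see the shape of this before downloading anything, here is a toy sketch in R. The ids and temperatures are made up, the name dset1_toy is mine, and the use of annual ts objects is an assumption for illustration only; the real series may be stored differently.

```r
# Toy stand-in for dset1: a list named by 11-digit station id.
# All ids and values below are invented for illustration.
dset1_toy <- list(
  "00000000001" = list(ts(c(-15.2, -14.8, -16.1), start = 1950),  # version 1
                       ts(c(-14.9, -16.0), start = 1951)),        # version 2
  "00000000002" = list(ts(c(9.8, 10.1, 10.4), start = 1950)),     # one version
  "00000000003" = list()                                          # no versions
)

# Number of versions held for each station
n_versions <- sapply(dset1_toy, length)

# Pull the second version for the first station and check its start year
x <- dset1_toy[["00000000001"]][[2]]
first_year <- start(x)[1]
```

Indexing by the id string picks out the station, and the second index picks out the version, so a station with no versions simply returns an empty list rather than an error.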
An annoying feature of Hansen’s intermediate files is that the ID numbers are related to, but different from, the ID numbers in the GISS information file. I’ve done a concordance and checked it, but really you’d think that they could manage to keep one set of ID numbers. Even if you prefer another language to R (which you shouldn’t :) ), it’s probably easier to port this sort of organized information out of R into another language. This program supersedes my previous scraping program. The program requires giss.info.tab for names.
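A concordance of this kind is just a two-column lookup table. The sketch below shows the general idea with match(); the column names info_id and file_id, the helper to_info_id and all of the id values are hypothetical, and it says nothing about how the two GISS numbering schemes are actually related.

```r
# Hypothetical concordance: one row per station, pairing the two id schemes.
# Every id here is made up.
concordance <- data.frame(
  info_id = c("00000000001", "00000000002", "00000000003"),
  file_id = c("900000001", "900000002", "900000003"),
  stringsAsFactors = FALSE
)

# Translate ids found in the intermediate files into information-file ids.
# Ids with no entry in the concordance come back as NA.
to_info_id <- function(file_id) {
  concordance$info_id[match(file_id, concordance$file_id)]
}

translated <- to_info_id(c("900000003", "900000001", "999999999"))

# A basic sanity check: neither column should contain repeats
one_to_one <- anyDuplicated(concordance$info_id) == 0 &&
  anyDuplicated(concordance$file_id) == 0
```

Checking for NA results and for duplicates in either column is a cheap way to confirm that a concordance covers every station exactly once.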
I use the giss.info.tab table to keep track of stations. To plot (or recover) Verhojansk data, here’s a simple command in which the letters “VERHO” are looked up, the id recovered and the series located.