Another Bouquet for Geert at KNMI

Geert Jan van Oldenborgh of KNMI develops and, in his spare time, maintains KNMI’s excellent Climate Explorer. I recommend that Curt Covey of PCMDI examine whether PCMDI should abandon its own much less satisfactory access points.

Hadley Center as well.

The other day, I was interested in deriving a HadISST monthly tropical average. The data of interest to me was ultimately only a few hundred KB, if I could have downloaded the monthly average, which isn’t available in pre-calculated form. I started, logically enough, with the Hadley Center HadISST page, which led me to the download page.

The data of interest sits in a gzipped netcdf file (212 MB). I’m fairly handy with NetCDF and downloaded the file, which expanded to over 400 MB. Unfortunately, the file was too big to read into R on my computer, and my R session crashed. I was really only interested in data after 1979, and smaller decadal files were available in sizes that I could manage. I wrote a little program to read the files, extract the tropical cells and take a tropical average. So far so good. However, the data proved to be in deg C rather than anomalies. To get an anomaly, I needed to do it all over again, this time concatenating all the decadal files in order to calculate benchmark averages for each cell.

There was nothing conceptually complicated about it, but it would take another couple of hours to do. In taking these sorts of averages, you have to be careful with your array handling to ensure that you don’t make a mistake.
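
For concreteness, here is a minimal sketch of the kind of calculation involved. It is not the actual script I used: the decadal file name below is hypothetical, and it uses the ncdf package that comes up in the comments.

library(ncdf)

nc  <- open.ncdf("HadISST1_SST_1981-1990.nc")   # hypothetical decadal file
lat <- get.var.ncdf(nc, "lat")
sst <- get.var.ncdf(nc, "sst")   # lon x lat x time, deg C; missing values become NA
close.ncdf(nc)

trop <- which(abs(lat) <= 20)            # tropical band, here 20S-20N
w    <- cos(lat[trop] * pi / 180)        # area-weight each latitude row

# monthly tropical mean: average over longitude, then weighted over latitude
trop.mean <- apply(sst[, trop, ], 3, function(x)
  weighted.mean(colMeans(x, na.rm = TRUE), w, na.rm = TRUE))

# anomaly relative to a monthly climatology (assumes the file starts in January);
# a proper base period needs the concatenated record, i.e. the second pass above
month   <- (seq_along(trop.mean) - 1) %% 12 + 1
clim    <- tapply(trop.mean, month, mean)
anomaly <- trop.mean - clim[month]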

I remembered that Bob Tisdale had plotted some HadISST results and I was 100% certain that he hadn’t gone through this sort of elaborate exercise. Sure enough, he’d used KNMI – which I’d used for extracting model data, but hadn’t thought of for HadISST.

Off to the KNMI Climate Explorer webpage. I quickly located the HadISST series, set the radio buttons to extract the TRP data and got the required anomaly series about 40 seconds later. The data handling was done on KNMI’s big computer using tested programs and only the data of interest was sent over the internet.

Another bouquet for Geert Jan’s excellent program.

And if anyone from Hadley Center or PCMDI reads this thread, suck it in and link to KNMI.


24 Comments

  1. Posted May 29, 2009 at 9:06 AM | Permalink

    Yes. They should link to KNMI. Geert Jan has done a great job making loads of data much more accessible.

  2. Jeff Alberts
    Posted May 29, 2009 at 9:13 AM | Permalink

    And how can you not like a guy named Geert Jan van Oldenborgh?

    • Chris Schoneveld
      Posted May 31, 2009 at 3:50 PM | Permalink

      Re: Jeff Alberts (#2), “And how can you not like a guy named Geert Jan van Oldenborgh?”

      Well, if you knew that it is politically incorrect at KNMI to doubt AGW, you might not like the guys who have managed to keep their jobs. As we all know, Tennekes, a professor of aeronautical engineering at the Pennsylvania State University from 1965 to 1977 and thereafter director of research at the Royal Netherlands Meteorological Institute (KNMI), was forced into early retirement in 1995 because of his contrary views on global warming.

  3. Wolfgang Flamme
    Posted May 29, 2009 at 10:24 AM | Permalink

    Could someone please point me to an R script for this?

    I’ve tried

    http://www.climateaudit.org/?p=5988#comment-341541

    in vain. It looks like knmi.nl doesn’t recognize my email address and the pages returned don’t fit the pattern, although I’ve registered. They’ve mailed me a password but I have no idea how to make use of it in the script. Is there something else I’m supposed to do first?
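
    In the meantime, a generic fallback that doesn’t need that script: once logged in through a browser, the Climate Explorer results page links to each series as a plain text file. The URL below is made up, but if the file follows the usual Climate Explorer layout (“#” comment headers, one row per year with twelve monthly columns), R can read it directly:

    url <- "http://climexp.knmi.nl/data/your_series_here.dat"  # hypothetical URL
    x   <- read.table(url, comment.char = "#",
                      col.names = c("year", month.abb))
    head(x)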

  4. Posted May 29, 2009 at 3:19 PM | Permalink

    Geert Jan has always answered my questions with detailed, extremely helpful explanations.

  5. pwyll
    Posted May 29, 2009 at 5:25 PM | Permalink

    You may want to try R in a 64-bit environment; you can address and use much more RAM, so squeezing your problem into a small amount of memory becomes less of an issue.

  6. Willis Eschenbach
    Posted May 29, 2009 at 5:25 PM | Permalink

    I had a couple of questions about KNMI. Geert Jan answered them quickly and professionally, and was very supportive. He is a beacon of light in a vast wasteland of so-called climate “scientists” who hide, obfuscate, delay, deny, and stick unpleasant results in their version of the CENSORED folder. Many thanks, Geert Jan.

    w.

  7. Chad
    Posted May 29, 2009 at 7:43 PM | Permalink

    I’ve had memory problems too when dealing with NetCDF. I was just extracting SST data for HadGEM1 and ran out of memory. Better to just extract the data in decadal chunks.
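
    For example, with the ncdf package you can pull one decade at a time using the start/count arguments instead of reading the whole variable (file name hypothetical):

    library(ncdf)
    nc <- open.ncdf("HadISST_sst.nc")
    # read 120 months (one decade) beginning at month 1201; -1 means "all"
    sst.chunk <- get.var.ncdf(nc, "sst", start = c(1, 1, 1201),
                              count = c(-1, -1, 120))
    close.ncdf(nc)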

  8. Geo
    Posted May 29, 2009 at 9:04 PM | Permalink

    pwyll–

    Perhaps some kind soul who already has a 64-bit environment would try it out for Steve first and let him know if it is worth the pain of shifting his platform for 64-bit goodness (maybe even virtualized OSes so he can switch back and forth as necessary).

  9. Curt Covey
    Posted May 29, 2009 at 9:10 PM | Permalink

    The PCMDI plans to continue maintaining (and indeed expanding) its own database of climate-model output, but I see no reason why we cannot add a link to the KNMI database. Thanks to Steve et al. for alerting me to its existence. Next week I’ll talk with the people in charge of our Web site about a link.

  10. Posted May 30, 2009 at 10:30 AM | Permalink

    In Holland we give a “pluim” (a compliment; literally, a plume).

  11. Gary Strand
    Posted May 31, 2009 at 2:19 PM | Permalink

    The interface to the PCMDI CMIP3 data archive will be replaced this summer with vastly improved and more-efficient software, in anticipation of CMIP5.

    One issue with on-the-fly server-side processing is that unless the metadata is enhanced, it can be difficult to determine what exactly was done to the data. For example, in creating annual means, was the output data merely summed across each year and divided by 12, or was each month weighted appropriately (February thus having only 28/31, about 90%, of the weight of January)? Additionally, if you want a geographic subset that doesn’t exactly match the grid boundaries, how is that handled? Lastly, if someone wants a large amount of processing done, can the hardware handle it, and does that unfairly infringe on the equal access of others?
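
    To make the first question concrete, a toy example in R with made-up monthly values:

    days    <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)  # non-leap year
    monthly <- rnorm(12, mean = 15)               # hypothetical monthly means
    naive    <- mean(monthly)                     # sum across the year / 12
    weighted <- sum(monthly * days) / sum(days)   # month-length-weighted mean
    c(naive, weighted)                            # generally not equal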

    One thing I’ve noticed about KNMI’s processing is that it’s unclear what the original data was, and that nearly all of the critical metadata is lost in the output file. Also, the time axis of the file I created is poorly chosen: it’s defined as “years since…”, which isn’t well-defined. It says it’s a Gregorian calendar, but the original data was not. That could create interpretation problems.

  12. steven mosher
    Posted Jun 1, 2009 at 3:07 PM | Permalink

    They should just expose an interface to the database and let people write their own extraction/analysis/graphical packages.

    • Gary Strand
      Posted Jun 1, 2009 at 4:05 PM | Permalink

      Re: steven mosher (#15),
      “They should just expose an interface to the database and let people write their own extraction/analysis/graphical packages.”

      That’s likely to create a security headache.

      • Not Sure
        Posted Jun 1, 2009 at 8:05 PM | Permalink

        Re: Gary Strand (#16),

        That’s likely to create a security headache.

        Why? The data can be read-only.

  13. Gary Strand
    Posted Jun 1, 2009 at 11:01 PM | Permalink

    Why? The data can be read-only.

    Allowing individuals outside the institution access to hardware and software resources tends to make security people very nervous. The concept of a user uploading their own code into the system just won’t be allowed.

  14. jeez
    Posted Jun 2, 2009 at 3:20 AM | Permalink

    I talked to Moshpit about this. Security concerns aside, creating an API that allows people to essentially write their own SQL statements opens the system not only to abuse, but to the potential for poorly written scripts or queries to peg the CPUs, bringing the system to its knees. It is not practical.

    Mosh considers this a challenge: to idiot-proof an API.

    Mosh, you have more pertinent things to do.

  15. Not Sure
    Posted Jun 2, 2009 at 11:09 AM | Permalink

    The idea of uploading code is ridiculous. All we’re talking about here is making data available. There’s no more need to upload code than there is to upload code to Gmail to read your messages.

    There are lots of publicly available APIs on the web today, and all sorts of ways of rate-limiting access to prevent denial-of-service attacks. Any security person who’s actually technically competent, and not just a big-company bureaucrat, knows about them and can implement them easily and quickly.

  16. Not Sure
    Posted Jun 2, 2009 at 10:07 PM | Permalink

    I had no problems opening these under 64-bit Linux. I’m not sure that it’s the 64-bitness that fixes things, though, so I’m going to try 32-bit Linux next. I first used a package called “RNetCDF” instead of “ncdf”, but both R extensions opened the files without problems:

    R version 2.8.1 (2008-12-22)
    Copyright (C) 2008 The R Foundation for Statistical Computing
    ISBN 3-900051-07-0

    > gz <- gzfile("HadISST_sst.nc.gz", "rb")
    > data <- readBin(gz, "raw", 104857600)
    > close(gz)
    > tempfile <- file("temp.nc", "wb")
    > writeBin(data, tempfile)
    > close(tempfile)
    > library(ncdf)
    > nc <- open.ncdf("temp.nc")
    > nc2 <- open.ncdf("ucHadISST_sst.nc")
    > library(RNetCDF)
    > nc3 <- open.nc("temp.nc")
    > nc4 <- open.nc("ucHadISST_sst.nc")
    > print.ncdf(nc)
    [1] "file temp.nc has 4 dimensions:"
    [1] "lon Size: 360"
    [1] "lat Size: 180"
    [1] "time Size: 1671"
    [1] "nv Size: 2"
    [1] "------------------------"
    [1] "file temp.nc has 2 variables:"
    [1] "double time_bnds[nv,time] Longname:time_bnds Missval:1e+30"
    [1] "float sst[lon,lat,time] Longname:Monthly 1 degree resolution SST Missval:-1.00000001504747e+30"
    > print.ncdf(nc2)
    [1] "file ucHadISST_sst.nc has 4 dimensions:"
    [1] "lon Size: 360"
    [1] "lat Size: 180"
    [1] "time Size: 1671"
    [1] "nv Size: 2"
    [1] "------------------------"
    [1] "file ucHadISST_sst.nc has 2 variables:"
    [1] "double time_bnds[nv,time] Longname:time_bnds Missval:1e+30"
    [1] "float sst[lon,lat,time] Longname:Monthly 1 degree resolution SST Missval:-1.00000001504747e+30"
    > print.nc(nc3)
    dimensions:
    lon = 360 ;
    lat = 180 ;
    time = 1671 ;
    nv = 2 ;
    variables:
    float lon(lon) ;
    lon:long_name = "Longitude" ;
    lon:standard_name = "longitude" ;
    lon:units = "degrees_east" ;
    lon:actual_range = -179.5, 179.5 ;
    float lat(lat) ;
    lat:long_name = "Latitude" ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
    lat:actual_range = -89.5, 89.5 ;
    double time(time) ;
    time:long_name = "Time" ;
    time:bounds = "time_bnds" ;
    time:standard_name = "time" ;
    time:units = "hours since 1-1-1 00:00:0.0" ;
    time:actual_range = 16383360, 17603232 ;
    time:delta_t = "0000-00-01 00:00:00" ;
    double time_bnds(nv, time) ;
    float sst(lon, lat, time) ;
    sst:long_name = "Monthly 1 degree resolution SST" ;
    sst:standard_name = "sea_surface_temperature" ;
    sst:units = "degC" ;
    sst:add_offset = 0 ;
    sst:cell_methods = "time: lat: lon: mean" ;
    sst:scale_factor = 1 ;
    sst:_FillValue = -1e+30 ;
    sst:missing_value = -1e+30 ;
    sst:actual_range = -1.8, 34.76296 ;
    sst:description = "HadISST 1.1 monthly average sea surface temperature" ;

    // global attributes:
    :Conventions = "CF-1.0" ;
    :title = "Monthly version of HadISST sea surface temperature component" ;
    :institution = "Met Office, Hadley Centre for Climate Research" ;
    :source = "HadISST" ;
    :history = "09/11/2006 HadISST converted to NetCDF from pp format by John Kennedy" ;
    :references = "Rayner, N. A., Parker, D. E., Horton, E. B., Folland, C. K., Alexander, L. V., Rowell, D. P., Kent, E. C., Kaplan, A. Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century, J. Geophys. Res., Vol. 108, No. D14, 4407, 10.1029/2002JD002670" ;
    :comment = "Data restrictions: for academic research use only. Data are Crown copyright see (http://www.opsi.gov.uk/advice/crown-copyright/copyright-guidance/index.htm)" ;
    :supplementary_information = "Updates and supplementary information will be available from http://www.hadobs.org" ;
    :contact = "j" ;
    > print.nc(nc4)
    dimensions:
    lon = 360 ;
    lat = 180 ;
    time = 1671 ;
    nv = 2 ;
    variables:
    float lon(lon) ;
    lon:long_name = "Longitude" ;
    lon:standard_name = "longitude" ;
    lon:units = "degrees_east" ;
    lon:actual_range = -179.5, 179.5 ;
    float lat(lat) ;
    lat:long_name = "Latitude" ;
    lat:standard_name = "latitude" ;
    lat:units = "degrees_north" ;
    lat:actual_range = -89.5, 89.5 ;
    double time(time) ;
    time:long_name = "Time" ;
    time:bounds = "time_bnds" ;
    time:standard_name = "time" ;
    time:units = "hours since 1-1-1 00:00:0.0" ;
    time:actual_range = 16383360, 17603232 ;
    time:delta_t = "0000-00-01 00:00:00" ;
    double time_bnds(nv, time) ;
    float sst(lon, lat, time) ;
    sst:long_name = "Monthly 1 degree resolution SST" ;
    sst:standard_name = "sea_surface_temperature" ;
    sst:units = "degC" ;
    sst:add_offset = 0 ;
    sst:cell_methods = "time: lat: lon: mean" ;
    sst:scale_factor = 1 ;
    sst:_FillValue = -1e+30 ;
    sst:missing_value = -1e+30 ;
    sst:actual_range = -1.8, 34.76296 ;
    sst:description = "HadISST 1.1 monthly average sea surface temperature" ;

    // global attributes:
    :Conventions = "CF-1.0" ;
    :title = "Monthly version of HadISST sea surface temperature component" ;
    :institution = "Met Office, Hadley Centre for Climate Research" ;
    :source = "HadISST" ;
    :history = "09/11/2006 HadISST converted to NetCDF from pp format by John Kennedy" ;
    :references = "Rayner, N. A., Parker, D. E., Horton, E. B., Folland, C. K., Alexander, L. V., Rowell, D. P., Kent, E. C., Kaplan, A. Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century, J. Geophys. Res., Vol. 108, No. D14, 4407, 10.1029/2002JD002670" ;
    :comment = "Data restrictions: for academic research use only. Data are Crown copyright see (http://www.opsi.gov.uk/advice/crown-copyright/copyright-guidance/index.htm)" ;
    :supplementary_information = "Updates and supplementary information will be available from http://www.hadobs.org" ;
    :contact = "j" ;
    >

    One interesting thing: the file uncompressed using Steve’s method is only 1/4 the size it should be:

    ls -lh
    total 720M
    -rw-r--r-- 1 me me 206M May 5 08:30 HadISST_sst.nc.gz
    -rw-r--r-- 1 me me 100M Jun 2 20:38 temp.nc
    -rw-r--r-- 1 me me 414M Jun 2 20:35 ucHadISST_sst.nc

    I wonder if this is also an accident of using Linux.

  17. Not Sure
    Posted Jun 2, 2009 at 10:10 PM | Permalink

    Argh, WordPress ate a bunch of the R ’cause it had leading greater-thans. Here it is again.

    gz <- gzfile("HadISST_sst.nc.gz", "rb")
    data <- readBin(gz, "raw", 104857600)
    close(gz)
    tempfile <- file("temp.nc", "wb")
    writeBin(data, tempfile)
    close(tempfile)
    library(ncdf)
    nc <- open.ncdf("temp.nc")
    nc2 <- open.ncdf("ucHadISST_sst.nc")
    library(RNetCDF)
    nc3 <- open.nc("temp.nc")
    nc4 <- open.nc("ucHadISST_sst.nc")
    print.ncdf(nc)
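
    Incidentally, this may explain the 1/4-size puzzle in my previous comment: readBin() reads at most the requested number of items, and 104857600 raw bytes is exactly 100 MiB, which is just the size temp.nc came out at. A version that loops to end-of-file instead:

    gz  <- gzfile("HadISST_sst.nc.gz", "rb")
    out <- file("temp.nc", "wb")
    repeat {
      chunk <- readBin(gz, "raw", 2^20)   # 1 MiB per read
      if (length(chunk) == 0) break
      writeBin(chunk, out)
    }
    close(gz)
    close(out)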

  18. Gary Strand
    Posted Jun 2, 2009 at 10:42 PM | Permalink

    The problem isn’t with bitness; it’s with R and/or the software that allows R to understand netCDF. The file itself is small (less than 1/2 GB), so 32-bit Linuces should have no problem with it.

  19. Not Sure
    Posted Jun 2, 2009 at 11:12 PM | Permalink

    Exactly the same results with 32-bit Linux. Must be a Windows thing.

