I’m posting another script for unzipping a file in R (in part so that I can keep track of these things myself). I’m trying to figure out how to download the Australian data in R and am reviewing some earlier successful efforts (which have relied on Nicholas).
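Here is how one can get at the NOAA gridded data directly from R. First download the file to a temporary location. The mode="wb" argument writes the download in binary mode, which a compressed file needs (on Windows in particular, a text-mode download would corrupt it), so it’s something you have to do.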
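As a rough sketch of the download step, something like the following should work. The helper name fetch_to_temp is made up for illustration and is not from Nicholas' scripts. To keep the example runnable offline, it "downloads" a small gzip file created locally through a file:// URL (written for a Unix-like path); in real use you would pass the NOAA URL instead.

```r
# Hypothetical helper (not from the original scripts): download a URL to a
# temporary file, writing in binary mode so compressed data arrives intact.
fetch_to_temp <- function(url, fileext = ".gz") {
  dest <- tempfile(fileext = fileext)
  download.file(url, dest, mode = "wb", quiet = TRUE)
  dest
}

# Offline stand-in for a remote file: write a tiny gzip-compressed text file.
src <- tempfile(fileext = ".gz")
con <- gzfile(src, "w")
writeLines(c("line one", "line two"), con)
close(con)

# Fetch it via a file:// URL, then read the lines back through gzfile().
local_copy <- fetch_to_temp(paste0("file://", normalizePath(src, winslash = "/")))
con <- gzfile(local_copy)
got <- readLines(con)
close(con)
got
```

The temporary file is removed when the R session ends, which suits data that is re-downloaded on each update.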
The following reads the decompressed contents into R as lines of text. The unzipping in the Russian meteo script uses the scan function; here readLines works. The resulting object Data is a character vector that can be handled with ordinary techniques.
handle <- gzfile("anom-grid2-1880-current.dat.gz")
Data <- readLines(handle)
close(handle) # release the connection once the lines are read
length(Data) # 331142
In this case, the data comes out as 12 values per line, with 217 lines per month-year combination: one header line with the month and year, followed by 216 data lines. That gives (217-1)*12 = 2592 = 72*36 gridcell values per month. There are 331142 lines in the file currently. The data is ordered with latitude as the hour hand (N to S in 5 degree increments) and longitude as the minute hand (W to E from the Dateline in 5 degree increments), i.e. longitude varies fastest. To make a collated version of the time series with gridcells in each column in the same hour hand-minute hand order (which I try to use consistently), I first identify and remove the header lines with month-year information (lines 1, 218, 435, …) and then transform the data through matrix operations.
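The header-removal step is not spelled out in the code here. A minimal sketch, assuming a header sits at the start of every block of 217 lines (the helper name drop_headers is made up for illustration):

```r
# Hypothetical helper (not from the original post): drop the month-year
# header line that starts each block of `block` lines.
drop_headers <- function(lines, block = 217) {
  stopifnot(length(lines) %% block == 0)           # expect whole blocks only
  header_idx <- seq(1, length(lines), by = block)  # 1, 218, 435, ... for block = 217
  lines[-header_idx]
}

# Toy example with blocks of 3 lines (1 header + 2 data lines).
toy <- c("H1", "a", "b", "H2", "c", "d")
kept <- drop_headers(toy, block = 3)
kept
```

Applied as Data <- drop_headers(Data), this would take the 331142 lines down to 331142 - 1526 = 329616 data lines.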
# each line holds 12 fixed-width fields of 6 characters: columns 1-6, 7-12, ..., 67-72
noaa <- sapply(0:11, function(i) as.numeric(substr(Data, 6*i + 1, 6*i + 6)))
dim(noaa) #329616 12
# 329616*12/2592 =1526
# transpose first so values are taken line by line in file order (R fills arrays by column)
noaa <- array(t(noaa), dim=c(2592, length(noaa)/2592)); dim(noaa) # 2592 1526
noaa<-t(noaa) #1526 2592
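Because R fills arrays column by column, the order in which values are fed to array() matters. Here is a toy check, with made-up dimensions, of a reshape that keeps values in file order, followed by a hypothetical lookup from column number to gridcell. The lookup assumes that cells are identified by their centres (87.5N down to 87.5S, 177.5W across to 177.5E); the text above only gives the ordering and the 5 degree spacing, so treat the centres as an assumption.

```r
# Toy stand-in: 2 months, 3 lines per month, 2 values per line
# (the real file has 216 data lines of 12 values per month).
vals_per_line <- 2
lines_per_month <- 3
cells <- vals_per_line * lines_per_month               # 6 "gridcells"
m <- matrix(1:12, ncol = vals_per_line, byrow = TRUE)  # row i = values on line i

# t(m) flattens line by line (file order); each column then holds one month,
# and the outer t() puts months in rows and gridcells in columns.
grid <- t(array(t(m), dim = c(cells, length(m) / cells)))
grid

# Hypothetical lookup: column k (1..2592) to cell centre, latitude as the
# hour hand (N to S) and longitude as the minute hand (W to E from the Dateline).
cell_center <- function(k, nlon = 72, step = 5) {
  c(lat = 87.5 - step * ((k - 1) %/% nlon),
    lon = -177.5 + step * ((k - 1) %% nlon))
}
cell_center(1)
cell_center(2592)
```

In the toy result each row is one month with its values still in file order, which is what the collated time series needs.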
Nicholas also looked recently at *.Z files, which he described as an obsolete compression format, not supported at present in R. It is used in some climate data sets e.g. http://cdiac.ornl.gov/ftp/tr055/sta60.dat.Z. Nicholas wrote a routine which he said is slow and which I haven’t tried yet. If it’s a one-off analysis, it’s easy enough to download and unzip manually. The need for automated unzipping occurs when the data is updated or if you need to call individual stations. I’ll revisit the *.Z files if and when I get to this situation.
There are a couple of BOM files at ftp://ftp.bom.gov.au/anon/home/bmrc/perm/climate/temperature/annual. As an exercise, I tried to see if I could modify Nicholas’ methods to unzip this data in R, but so far have been unsuccessful. I’m sure that Nicholas will have an answer.