Barabinsk, Russia 57N, 77E

I’ve been going through the process of reconciling gridded data and station data in one Russian gridcell. Most of my effort to date has been spent on creating tools for accessing and collating data archives into organized time series formats so that others don’t have to go through the same trials and tribulations of sorting out oddball data formats and can improve their analyses on their own gridcells if they want. I’ll post a variety of read scripts up in a day or two. I’ve been working with the gridcell 57.5N and 77.5E primarily because it happened to be adjacent to the Tarko-Sale gridcell, which I had posted on previously. It also appears to include only one station so reconciliation is easier and to have several archives that can be cross-checked.

I report here on 4 archives with station data on Barabinsk (GHCN v2, GISS, GSN, meteo.ru), two of which have daily information (GSN, meteo.ru). There are some other archives – NDP040, NDP048, an early Jones version – which I’ll try to create tools for as well. I also report on 3 gridded versions – HadCRU3, HadCRU2, GISS. Other gridded versions include CRUTEM3 and CRUTEM2 and some early Jones versions.

There are puzzles surrounding not simply the gridded data, but even the GHCN v2 station data. The GHCN v2 archived version ends in 1989, but the GSN version – which reconciles to the GHCN version during their overlap – continues to the present and is not incorporated in GHCN. The HadCRU3 gridded version ends in 1989, matching GHCN v2 except for 3 oddball monthly values in 2005; however, HadCRU2 went to 2002 (and started 10 years earlier than HadCRU3). HadCRU2 appears to use the updated Barabinsk information that is available at GSN but ignored in HadCRU3. The GISS gridded version diverges from both.

At this point, I have no particular conclusions about this aspect of temperature history; I am merely trying to look at available information in an organized way. I will observe that these gridded temperature calculations are as much an accounting exercise as anything else. As an accounting exercise, audit trails “matter”. Thus regardless of whether or not any of this “matters” for policy, it is hard to get a favorable impression of how the audit trails are organized.

As a start, here is a spaghetti plot of annualized information from the 7 sources that I’ve collated so far:
3 gridded versions of 57N, 77E : HadCRU3, HadCRU2 and GISS;
4 station versions: GHCN v2 and GISS monthly; GSN and meteo.ru (calculated monthly from daily data). (Adjusted GHCN not considered yet.)

Some of the plots are not clearly represented in the spaghetti graph, because they are replicated by later colors – thus HadCRU3 is scarcely visible on the graph below, not because it isn’t there, but because it’s overwritten. The graph below is all in 1961-1990 anomaly format (which I’ve calculated from monthly data).
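For readers who want to reproduce the anomaly step on their own gridcell, here is a minimal sketch of one way to do it. It is not the script used for the figures: the function names, the dictionary layout keyed by (year, month), and the rule that a year needs all 12 months before it is annualized are all assumptions made for illustration.

```python
from collections import defaultdict

def monthly_normals(series, start=1961, end=1990):
    """Mean for each calendar month over the reference period.

    series: dict mapping (year, month) -> temperature in deg C.
    Missing months are simply absent from the dict (assumed layout).
    """
    buckets = defaultdict(list)
    for (year, month), value in series.items():
        if start <= year <= end:
            buckets[month].append(value)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}

def anomalies(series, start=1961, end=1990):
    """Subtract the reference-period normal for the matching calendar month."""
    normals = monthly_normals(series, start, end)
    return {(y, m): v - normals[m]
            for (y, m), v in series.items() if m in normals}

def annualize(anoms, min_months=12):
    """Average monthly anomalies by year.

    min_months=12 (an assumption) drops years with any missing month.
    """
    by_year = defaultdict(list)
    for (year, _month), value in anoms.items():
        by_year[year].append(value)
    return {year: sum(vals) / len(vals)
            for year, vals in by_year.items() if len(vals) >= min_months}
```

Running each archive through the same three functions puts station and gridded series on a common footing, and series can then be overlaid directly.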

Figure 1. Spaghetti graph of annualized Barabinsk station and gridcell anomalies (1961-1990)

As a quick point, one observes that all 7 versions match quite closely in the reference period (1961-1990) and that there is some divergence away from the reference period. As an exercise to ensure that I was on the right track (and this was important for debugging my read routines), I did a close-up of the raw monthly data for all 4 station archives before anomaly calculation (using daily data in two cases, averaging the max and min) for the period 1978-1980. This yielded virtually identical results for all 4 versions – so I feel confident that the GHCN v2, GISS, GSN and meteo.ru data for Barabinsk are all from the same original source and that later data from meteo.ru or GSN can be legitimately added to the GHCN v2 record (ending in 1989).
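The daily-to-monthly step mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the routine actually used: the tuple layout of the daily records and the `min_days` threshold for accepting a month are hypothetical, since the post does not say how incomplete months were handled.

```python
from collections import defaultdict

def monthly_from_daily(daily, min_days=20):
    """Monthly mean temperature from daily max/min records.

    daily: iterable of (year, month, day, tmin, tmax) tuples, with None
    marking a missing reading (assumed layout).
    The daily mean is taken as (tmin + tmax) / 2, and a month is kept only
    if at least min_days daily means are available (assumed threshold).
    """
    totals = defaultdict(lambda: [0.0, 0])  # (year, month) -> [sum, count]
    for year, month, _day, tmin, tmax in daily:
        if tmin is None or tmax is None:
            continue
        acc = totals[(year, month)]
        acc[0] += (tmin + tmax) / 2.0
        acc[1] += 1
    return {key: total / count
            for key, (total, count) in totals.items() if count >= min_days}
```

The resulting monthly values can be compared month by month against the archives that are already monthly, which is the kind of check shown in the 1978-1980 close-up.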

Figure 2. Raw Monthly Values for 4 Station Versions for 1978-1980

I calculated 1961-1990 normals for each of the 4 station versions separately. These are shown below and are consistent, adding further comfort to the identity of the underlying information in the 4 archives.

Figure 3. Monthly 1961-1990 Normals for the 4 Station Versions

Station Versions
To isolate differences in station versions, I show two formats below: one in spaghetti graph format which is useful for identifying similarities or differences in scale, but has the disadvantage that later colors overwrite earlier colors when series are identical; and a panel (plot.ts) format, which focuses on differences in start and end dates. The spaghetti graph for the 4 stations replicates the first spaghetti plot, but with 3 fewer series to focus analysis a little. One sees again that the versions closely align in the reference period and have virtually identical scales. The panel version (see below) shows start and end better.

Figure 4. Spaghetti Graph for 4 station versions of Barabinsk.

Here is the same information in panel (plot.ts) format. You see clearly that the start and end dates of the GISS and GHCN versions match, while the meteo.ru and especially GSN versions continue to the present. The GSN and meteo.ru versions commence in 1925, while the GISS and GHCN versions include some earlier data. I notice here that there is a hiatus of about 2 years around 1960 in the two archives based on daily information that is not present in GHCN. Re-examining GHCN, it has three versions on file: 296120002 has the hiatus around 1960, while the 296120000 and 296120001 versions do not. I have not seen any information on the varying provenance of these three versions or on how the GHCN 0 and 1 versions come to have information not present in the meteo.ru or GSN archives.
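A hiatus like the one around 1960 is easy to miss on a plot, so it can help to list gaps programmatically. The sketch below assumes a monthly series stored as a dict keyed by (year, month); the function name and output layout are hypothetical.

```python
def find_gaps(series, min_months=1):
    """List runs of missing months between the first and last observation.

    series: dict mapping (year, month) -> value (assumed layout).
    Returns a list of (first_missing, last_missing, n_months) tuples, where
    the first two entries are (year, month) pairs.
    """
    def to_index(ym):
        # Months counted consecutively so adjacent months differ by 1.
        return ym[0] * 12 + (ym[1] - 1)

    def from_index(i):
        return (i // 12, i % 12 + 1)

    indices = sorted(to_index(key) for key in series)
    gaps = []
    for prev, cur in zip(indices, indices[1:]):
        missing = cur - prev - 1
        if missing >= min_months:
            gaps.append((from_index(prev + 1), from_index(cur - 1), missing))
    return gaps
```

Running this on each station version and comparing the lists shows at a glance which archives share a gap and which do not.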

Figure 5. Panel Graph for 4 station versions of Barabinsk

The excellent documentation for NDP040 records several station moves for Barabinsk as follows:

29612 MOVE 1939 5 21 2 N
29612 MOVE 1949 10 20 4 NW
29612 PRCP 1955 9 19
29612 MOVE 1958 12 4 0 NE

Did any of this relate to the 1959 hiatus or contribute to a possible discontinuity around 1959?

Gridded Versions
Next, here is a comparison of the HadCRU2 and HadCRU3 versions. These are essentially identical over the period 1901-1989. However, HadCRU2 also has coverage from 1891-1901 and from 1990-2002 that is excluded from HadCRU3.
Figure 6. Comparison of HadCRU2 (red) and HadCRU3 (black)

This is particularly curious because the HadCRU2 extension appears to have been derived from the extended daily information from Barabinsk – why wouldn’t this be used in HadCRU3? The figure below shows the HadCRU versions against Barabinsk GSN – with the segments before and after the 1960 hiatus colored separately. The third tranche from 1901-1925 is from the early GHCN tranche not present in the other archives, colored a different blue shade.
Figure 7. CRU versions overlaid against selected Barabinsk station data versions.

Finally, here is a comparison of the GISS gridded version for the 2×2 cell containing Barabinsk against the same station versions. There are puzzling differences from the corresponding CRU gridcell and from the underlying station data. Although the GISS gridded data matches Barabinsk closely through the 1961-1990 reference period, the two series have recently been “diverging” with the “divergence factor” averaging over 1 deg C in the 2000s – with the Hansen gridded version being the warmer. However, the Hansen gridded version is also warmer than the GSN version. I don’t know why at present – perhaps it is related to the GHCN adjustment, which I’ll look at on another occasion.
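One simple way to put a number on this kind of divergence is to average the difference between two annual anomaly series by decade. The sketch below is illustrative only: it assumes both series are dicts keyed by year and already expressed as anomalies on the same reference period, and the function name is hypothetical.

```python
from collections import defaultdict

def decadal_mean_difference(a, b):
    """Mean of (a - b) by decade, using only years present in both series.

    a, b: dicts mapping year -> annual anomaly in deg C (assumed layout).
    Returns a dict mapping decade start year (e.g. 2000) -> mean difference.
    """
    by_decade = defaultdict(list)
    for year in a.keys() & b.keys():
        by_decade[year // 10 * 10].append(a[year] - b[year])
    return {decade: sum(vals) / len(vals)
            for decade, vals in sorted(by_decade.items())}
```

With the gridded series as `a` and the station series as `b`, a positive value for a decade means the gridded version is the warmer of the two on average over the years they share.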

Figure 8. GISS gridded against GSN station data (and early GHCN tranche)


  1. John Hekman
    Posted Mar 6, 2007 at 11:06 AM | Permalink

    This is great, Steve. The surface temperature data are so important. Have you thought about starting a cooperative organization on the net to share the work and document the analysis that is being done on this issue? It might get enough help to build an alternative, audited, data set of world temperatures to test the validity of the Jones version.

    It might also get government funding.

  2. jae
    Posted Mar 6, 2007 at 11:50 AM | Permalink

    It is interesting that none of the plots show any warming since about 1978.

  3. JerryB
    Posted Mar 6, 2007 at 12:44 PM | Permalink

    I do not recall a specific statement from GISS on the question,
    but my impression is that except for the USHCN stations, GISS
    uses only “raw” GHCN station data, not GHCN adjusted station
    data. So, except for GISS estimates for missing data, GISS
    numbers for Barabinsk should match GHCN V2, unless you used
    the ‘after combining sources as same location’ option, or
    the ‘after homogeneity adjustment’ option, when downloading
    from GISS.

  4. Steve Sadlov
    Posted Mar 6, 2007 at 12:53 PM | Permalink

    RE: #2 – PDO flip?

  5. bernie
    Posted Mar 6, 2007 at 1:28 PM | Permalink

    I thought I would try to help out in the terms of the UHI effect. I looked for population figures for this location.
    The latest numbers are around 31K and it appears to have shrunk noticeably from about 1970. I did find one interesting source – though I am sure there are others – that lays
    out data for urbanization in Siberia from Stalin to the late 80s. Overall there appears to be a doubling of the population from
    its apparent founding in 1917 though the weather station predates this.
    However, for this particular town the population appears to be pretty steady since the 1939 until its decline in the 80s.
    It was founded in 1917, so I imagine you can assume that the growth was initially brisk due to the Trans-Siberian
    station and then flattened out. It would be interesting to know whether there was any major change in the land use during the 60s and 70s and what precipitated the pretty significant decline in the 80s.


    There is another weather station apparently close by at KUYBYSHEV/BEZENCUK. This station drew the attention of the US Army for
    whatever reason. It has a different altitude from Barabinsk – 40m and 120m.

  6. MarkW
    Posted Mar 6, 2007 at 2:26 PM | Permalink

    Here’s another thing we have to be careful about when trying to figure out UHI.
    Just because an area’s population is declining is not evidence that the UHI signature is decreasing.
    At least it won’t decrease as fast as it rose.
    Sure, there are fewer people running motors and creating heat, but the roads, buildings, parking lots, etc. still exist.
    They didn’t get torn up just because the people left.
    Over the years, the area will return to nature, but this process will take decades.

  7. bernie
    Posted Mar 6, 2007 at 4:53 PM | Permalink

    No, I understand that UHI effects are to do with local energy consumption per unit of area and the immediate physical location.
    However, I in part noted the population numbers because of the pattern in the
    figures circa 1980. It is only suggestive but without an actual view of the physical site who knows.

    I guess I am more intrigued by another weather station that is very close by but apparently not in the data set.

  8. Posted Mar 6, 2007 at 6:00 PM | Permalink

    Relevant to your ploughing on thru the huge landscape of USSR T data.
    I have just constructed a new global map of GHCN minus CRU trend differences for the period 1976-2003, arising out of that 2005 Vose et al paper, “An intercomparison of trends in surface air temperature analyses at the global,
    hemispheric, and grid-box scale”.
    You and your readers might be amazed at the large differences between GHCN and CRU (CRUT2) all over the globe, not just over the old USSR. The two datasets, on which huge sums of taxpayer monies have been spent over two decades, are discordant at a rate greater than global warming, over a majority of grid cells for which co-located data is present. This is for a period chosen by the Vose et al authors.

  9. jae
    Posted Mar 6, 2007 at 6:56 PM | Permalink

    8: WOW. How on earth can that happen, when (I presume) they use the same raw data?

  10. Steve McIntyre
    Posted Mar 6, 2007 at 7:44 PM | Permalink

    #8. Readers of CA should check out Warwick’s blog. One of Warwick’s characteristic techniques – second nature to someone in mineral exploration, but seemingly not used by the Team – is to look for anomalies, e.g. gridcells with extreme trends, and then analyze the data.

  11. John Norris
    Posted Mar 6, 2007 at 9:13 PM | Permalink

    re #8: I read Warwick Hughes’ blog. Not sure what it leads to yet. I need some help with that.

    When comparing the two data sets, (that I’m guessing at least one is an input towards the Fourth Assessment Report Summary for Policy makers SPM-3 chart that shows about .7 or .8 degrees c increase from 1900 – 2000) you have identified that 57% of the gridcells differ by at least 1 full degree c, per century, split pretty evenly between + and -.

    Is it a problem to claim an uncertainty of less than +/- .1 degree c in the SPM-3 chart (I eyeballed their uncertainty) in measured global average surface temperature when over half your data when compared to another data set has differences of over an order of magnitude greater than your uncertainty?

  12. Hans Erren
    Posted Mar 7, 2007 at 3:02 AM | Permalink

    re 10:
    The difference between climate science and mineral exploration is that a false anomaly doesn’t have adverse impact to climate science (yet). In mineral exploration an empty hole drilled on a false anomaly can bankrupt a company. That’s why in mineral exploration every anomaly is squeezed out to the last bit to see if it’s a processing artifact before finally marking the map with an X (drill here).

  13. Posted Mar 7, 2007 at 3:57 AM | Permalink

    Re #9 Jae, We have to understand that the various big groups “do their own thing” with the raw data, alter it here and there, chop bits off etc.
    Then the period is a bit short, so that trend numbers change rapidly.
    However, Vose et al chose this period, it’s their data, and I am just presenting the trend differences clearly without trying to cover up anything.

  14. Michael Jankowski
    Posted Mar 7, 2007 at 8:11 AM | Permalink

    Re Warwick:

    Then studying Fig 6 we began pondering the Vose et al statement in their section 4 [14] that, “..9.4% of all grid-box trends differ by more than 0.100 [degree C per decade] in both magnitude and sign.”
    It soon became clear that this clever use of the calming number 9.4% was in fact concealing the fact that a vastly greater number of grid points varied by more than 0.1 degree C per decade, regardless of sign. Vose et al are trying to show that GHCN and CRU are similar, so it does not matter if for any grid point one or the other is higher, it is DIFFERENCE that is the issue, so sign is a red herring…
    …Last year we obtained a file of the various global grid point trends re Fig 5 from Russell Vose and found that in fact a staggering 57% of grid points differed by more than 0.1, either + or -.
    So, 57% of grid points differ by greater than the magnitude of century long global warming…

    Great observation and catch of Vose et al’s tricky wording.

