The section of Hansen’s code that we’d been looking at immediately before the dump of relatively unannotated code was how Hansen combined scribal versions of different stations in the 2-column case. That step looks to contain a material error, already discussed without the benefit of source code, which is going to be examined further. We had just started trying to figure out how Hansen dealt with multiple series. This appears to be handled in the Step 1 program comb_records.py, which employs a subroutine get_best in which station versions are ranked according to provenance as follows:
'MCDW': 4, 'USHCN': 3, 'SUMOFDAY': 2, 'UNKNOWN': 1
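To make the ranking concrete, here is a minimal sketch of how such a lookup could select among versions. Only the four rank values come from comb_records.py; the function name pick_best, the (source, record_id) tuples and the fallback to 'UNKNOWN' are my own assumptions for illustration, and the real get_best may well apply further criteria.

```python
# Rank values as quoted from get_best in comb_records.py.
# Everything else below is a hypothetical illustration, not Hansen's code.
SOURCE_RANK = {'MCDW': 4, 'USHCN': 3, 'SUMOFDAY': 2, 'UNKNOWN': 1}


def pick_best(versions):
    """Return the version whose source has the highest rank.

    `versions` is an assumed list of (source, record_id) tuples.
    Sources missing from the table are treated as 'UNKNOWN' (an assumption).
    On ties, the earliest entry in the list wins.
    """
    return max(
        versions,
        key=lambda v: SOURCE_RANK.get(v[0], SOURCE_RANK['UNKNOWN']),
    )


# Toy example with made-up record ids.
versions = [('USHCN', 'rec_a'), ('MCDW', 'rec_b'), ('UNKNOWN', 'rec_c')]
best = pick_best(versions)
print(best)
```

On this reading, an MCDW version would be preferred over a USHCN version of the same station whenever both exist.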
I am unaware of any mention of this ranking procedure in Hansen et al 1999, 2001. Hansen mentions MCDW and U.S. First Order (which seems to be mostly ASOS) as follows:
Second, updates of the GHCN data covering the most recent several years include only three component data sets [Peterson and Vose, 1997]: (1) up to about 1500 of the global MCDW stations that report monthly data over the Global Telecommunications System or mail reports to NCDC, (2) up to about 1200 United States Historical Climatology Network stations, which are mostly rural; (3) up to about 370 U.S. First Order stations, which are mostly airport stations in the United States and U.S. territories in the Pacific Ocean. Third, the update for the final (current) year is based mainly on MCDW stations
We’ve talked about USHCN in the past, but not much about MCDW or First Order (ASOS) networks. Hansen provides three lists that pertain to these three networks (the lists occurring in both Step 0 and Step 1 files):
The first table, ushcn.tbl, is a concordance with 1221 rows (equal to the number of USHCN stations), linking USHCN identification numbers with GHCN station.inv numbers (carried forward into the GISS station.inv). This is the second such concordance of USHCN and GHCN identification numbers to appear online, an earlier concordance being posted here http://www.climateaudit.org/data/ushcn/details.dat . (I’ve not compared the concordances yet.) So it’s nice to see GISS’ contribution. I checked the 1221 station IDs in ushcn.tbl for inclusion in the 7364 ids in GISS station.inv and all were included.
The next list, mcdw.tbl, is more problematic. There are 1502 stations in the mcdw.tbl list (which dates from 1998 BTW), which is consistent with the number in Hansen et al 1999. The USHCN list was a concordance of USHCN numbers with GHCN numbers. The MCDW table also appears to be a concordance, but there are a couple of big differences from the USHCN case, where the USHCN ids reconciled to USHCN listings and the GHCN ids reconciled to the GISS station.inv ids. In this case, neither happened.
Although 1301 GHCN ids in the concordance matched GISS station.inv ids, a total of 201 identifications did not, raising a couple of questions: Where did these new IDs come from? What is their purpose? How are they used?
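The inclusion checks described here amount to a set-membership test of the GHCN-side ids in a concordance against the ids in station.inv. A minimal sketch follows. The function name check_concordance and the toy ids are hypothetical, and I have deliberately left out file parsing, since the layout of the .tbl files is not described here.

```python
def check_concordance(concordance_ids, inventory_ids):
    """Split concordance ids into those found / not found in the inventory.

    Both arguments are assumed to be iterables of id strings that have
    already been extracted from the files (parsing is not shown).
    """
    inventory = set(inventory_ids)
    matched = [i for i in concordance_ids if i in inventory]
    unmatched = [i for i in concordance_ids if i not in inventory]
    return matched, unmatched


# Toy data with made-up ids, purely to show the shape of the check.
toy_inventory = ['10160355000', '10160360000', '42572503001']
toy_concordance = ['10160355000', '42572503001', '99999999999']

matched, unmatched = check_concordance(toy_concordance, toy_inventory)
print(len(matched), 'matched;', len(unmatched), 'unmatched:', unmatched)
```

Run against the real tables, the length of the unmatched list is the count of interest: zero for ushcn.tbl, 201 for mcdw.tbl.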
Also, where do the 9-digit “MCDW” numbers come from? There doesn’t appear to be an online concordance of MCDW ids. There are individual reports at http://www1.ncdc.noaa.gov/pub/data/mcdw/ . In these reports, 5-digit WMO numbers are used; the first 3 digits are country codes (which differ somewhat from other lexicons). There is a 4th digit in the concordance, which is usually 0. Again, I don’t know its function. In the USHCN case, there was a list of stations in the network; if someone can identify the provenance of the MCDW numbers in the Hansen concordance, I’d appreciate it.
I’ve spot checked some individual GISS records back to MCDW publications and have been able to trace some individual values back.
The MCDW stations appear to be primarily airports, including many international airports. For the ROW, these constitute the lion’s share of information since 1990.
The other new list is sumofday.tbl, which also appears to be a concordance of GHCN numbers with other identifications. There are 371 rows in this concordance, consistent with the number of First Order stations referred to in Hansen et al 1999. These are hourly stations, most of which have been ASOS since the 1990s (see post on the HO-83 thermometer).
Again, about 20 ids in this table do not reconcile with any GISS station.inv numbers. Also, I’ve so far been unable to locate a data inventory in which all the station codes on the other side of the concordance can be found.
AGU has specific policies on data provenance. Had Hansen observed these protocols, the exact digital source of the data would be specified – something that hasn’t happened so far, although they’ve indicated an attempt to improve their documentation.