The Harry Read_Me File

A CA reader has organized the Harry_Read_Me file here. Take a look.

And we thought GISTEMP was bad. And it’s not like the underlying calculations are very complicated.

77 Comments

  1. dicentra
    Posted Nov 23, 2009 at 10:44 AM | Permalink

    Forgive me for engaging in snark, but “hide the decline” tee-shirts are now available.

    http://blogs.news.com.au/dailytelegraph/timblair/index.php/dailytelegraph/comments/wear_the_decline/

    God bless the Internet.

  2. Posted Nov 23, 2009 at 11:03 AM | Permalink

    It’s better to start at 0

    http://di2.nu/foia/HARRY_READ_ME-0.html

    And if anyone wants to help me split part 35 into sensible smaller chunks I’ll welcome their help…

    • Follow the Money
      Posted Nov 25, 2009 at 10:10 PM | Permalink

      Francis, if you wouldn’t mind–what is the “Harry read me” code intended to produce? What would be the title of the end product? Because, please correct me otherwise, there appears to be precipitation and other data at issue in this code. So it would not seem to be the CRU temperature record graph. But Harry is putting so much time into it? Were the temp records insufficient for a rise, so other data were monkeyed in, fulfilling the original programmer’s belief that the temp records must be inadequate since CO2 is undeniably rising?

      • Follow the Money
        Posted Nov 25, 2009 at 10:34 PM | Permalink

        By the way, and showing my less than amateur status, if this code is for a model, I apologize for the question. But, in my way of thinking, he is spending too much time fixing an old modeling program when there is so much money around to whip up new ones.

  3. Ace
    Posted Nov 23, 2009 at 11:26 AM | Permalink

    http://www.realclimate.org/index.php/archives/2009/11/the-cru-hack/

    No doubt, instances of cherry-picked and poorly-worded “gotcha” phrases will be pulled out of context. One example is worth mentioning quickly. Phil Jones in discussing the presentation of temperature reconstructions stated that “I’ve just completed Mike’s Nature trick of adding in the real temps to each series for the last 20 years (ie from 1981 onwards) and from 1961 for Keith’s to hide the decline.” The paper in question is the Mann, Bradley and Hughes (1998) Nature paper on the original multiproxy temperature reconstruction, and the ‘trick’ is just to plot the instrumental records along with reconstruction so that the context of the recent warming is clear. Scientists often use the term “trick” to refer to a “a good way to deal with a problem”, rather than something that is “secret”, and so there is nothing problematic in this at all. As for the ‘decline’, it is well known that Keith Briffa’s maximum latewood tree ring density proxy diverges from the temperature records after 1960 (this is more commonly known as the “divergence problem”–see e.g. the recent discussion in this paper) and has been discussed in the literature since Briffa et al in Nature in 1998 (Nature, 391, 678-682). Those authors have always recommend not using the post 1960 part of their reconstruction, and so while ‘hiding’ is probably a poor choice of words (since it is ‘hidden’ in plain sight), not using the data in the plot is completely appropriate, as is further research to understand why this happens.

    • anon
      Posted Nov 25, 2009 at 3:34 PM | Permalink

      Excellent. Someone who understands Phil-speak. What does “(since it is ‘hidden’ in plain sight)” mean?

  4. JasonScando
    Posted Nov 23, 2009 at 11:27 AM | Permalink

    Wow, just scroll through some of

    http://di2.nu/foia/HARRY_READ_ME-35.html

    This guy is faced with horrific errors at every turn and does not resolve most of them.

  5. chainpin
    Posted Nov 23, 2009 at 11:31 AM | Permalink

    I like this one:

    19. Here is a little puzzle. If the latest precipitation database file
    contained a fatal data error (see 17. above), then surely it has been
    altered since Tim last used it to produce the precipitation grids? But
    if that’s the case, why is it dated so early? Here are the dates:

    /cru/dpe1a/f014/data/cruts/database/+norm/pre.0312031600.dtb
    – directory date is 23 Dec 2003

    /cru/tyn1/f014/ftpfudge/data/cru_ts_2.10/data_dec/cru_ts_2_10.1961-1970.pre.Z
    – directory date is 22 Jan 2004 (original date not preserved in zipped file)
    – internal (header) date is also ‘22.01.2004 at 17:57’

    So what’s going on? I don’t see how the ‘final’ precip file can have been produced from the ‘final’ precipitation database, even though the dates imply that. The obvious conclusion is that the precip file must have been produced before 23 Dec 2003, and then redated (to match others?) in Jan 04.

  6. EddieO
    Posted Nov 23, 2009 at 11:35 AM | Permalink

    BBC are allowing comments on an obscure part of their website.

    http://www.bbc.co.uk/blogs/paulhudson/2009/11/climategate-cru-hacked-into-an.shtml

  7. Posted Nov 23, 2009 at 11:42 AM | Permalink

    After spending a lot of time with HARRY_READ_ME.TXT I am going to put my money for the anonymous leaker on Harry. If I had his job I know I would be leaking like a sieve.

    • Sean
      Posted Nov 23, 2009 at 12:35 PM | Permalink

      Precisely what I was thinking. Actually, I feel for the guy. He is approaching the hardest of all tasks (sorting out a mess of someone else’s unexplained code) with tremendous equanimity. I would have committed hara-kiri by about the 50th line.

      • Paul Cummings
        Posted Nov 23, 2009 at 5:19 PM | Permalink

        This from Harry
        Two main filesystems relevant to the work:

        /cru/dpe1a/f014
        /cru/tyn1/f014

        why not foia?

  8. EW
    Posted Nov 23, 2009 at 12:40 PM | Permalink

    Now that is what I call redundant. Raypierre of RC fame expresses his concerns about the safety of climate science and scientists thusly:

    “…this illegal act of cyber-terrorism against a climate scientist (and I don’t think that’s too strong a word) is ominous and frightening. What next? Deliberate monkeying with data on servers? Insertion of bugs into climate models?
    Or at the next level, since the forces of darkness have moved to illegal operations, will we all have to get bodyguards to do climate science?”

    As the Harry read_me shows the quality of the original coding, is there really any need to insert bugs into the Team’s code?

    • Sean
      Posted Nov 23, 2009 at 1:01 PM | Permalink

      He should be more worried that someone would hack in and take all the bugs OUT.

    • Dave
      Posted Nov 24, 2009 at 8:01 AM | Permalink

      “act of cyber-terrorism against a climate scientist (and I don’t think that’s too strong a word)”

      I think ‘scientist’ is far too strong a word in this case…

  9. Richard Henry Lee
    Posted Nov 23, 2009 at 1:09 PM | Permalink

    Harry is most likely Ian Harris who works for Phil Jones at CRU. I doubt that he is the leaker since his READ ME is full of expletives which he probably would have deleted. But I could be wrong.

    The CRU temperature data is supposed to be the gold standard, yet the earlier paper records are “missing” and the computer records are terrible.

    Computer programmers at Universities are often not skilled in documenting their work and when they depart, they leave a mess behind. In this case, hapless Harry is trying to make sense of it. Since he is not entirely successful, he has to make some assumptions and fill in the gaps. The READ ME file is a diary of his frustrations in trying to update the Temperature record and it does not inspire confidence in the final results.

    This could have repercussions since the EPA and the Supreme Court relied on this data in their decisions.

  10. Posted Nov 23, 2009 at 2:06 PM | Permalink

    The data management issues are fascinating – I’m very curious to see what the outcome is. Will CRU 2.1 and CRU 3.0 be seen as “tainted” going forward, and will that impact previously published works? We’re tracking responses across the world here:

    ThisNewz CRU search

  11. Shona
    Posted Nov 23, 2009 at 2:24 PM | Permalink

    From what I’ve read of the Harry file (I too wondered if he was the whistleblower), I think there may be another problem: these people’s day job is holding the world’s climate database, right? If so, this isn’t fit for purpose. This is not a database, it’s a mess.

    I’m beginning to think that they COULDN’T have released the data even if they had wanted to. They don’t have it.

    This is massive. And the more I read about it, the more I think this is a failure of history book proportions.

    • Calvin Ball
      Posted Nov 23, 2009 at 3:04 PM | Permalink

      In a broader sense, that’s exactly the issue with CA and the IPCC, and the entire controversy. This is way too important an issue to be left in the hands of people who, regardless of the sophistication of their understanding of the science, are so sloppy with the management of the facts.

    • 40 Shades of Green
      Posted Nov 23, 2009 at 5:20 PM | Permalink

      If this was the private sector, they would lose the contract.

  12. FTM
    Posted Nov 23, 2009 at 2:29 PM | Permalink

    Woo hoo, the following goes all the way back to 2000. I had thought a PR outfit was managing the reeducation camp themes. Turns out the Team was competent in this area on its own.

    From: Phil Jones
    To: “Michael E. Mann” , “Folland, Chris”
    Subject: Re: FW: Mann etal
    Date: Fri, 11 Aug 2000 13:40:30 +0100
    Cc: jfbmitchell@xxxxxxxxx.xxx,k.briffa@xxxxxxxxx.xxx

    Chris and John (and Mike for info),
    I’m basically reiterating Mike’s email. There seem to be two lots of
    suggestions doing the rounds. Both are basically groundless.

    1. Recent paleo doesn’t show warming.

    This basically stems back to Keith Briffa’s paper in Nature in 1998
    (Vol 391, pp678-682). In this it was shown that northern boreal forest
    conifers don’t pick up all the observed warming since about the late
    1950s. ….. We don’t have paleo data for much of the last 20 years. It would require tremendous effort and resources to update a lot of the paleo series because they were collected during the 1970s/early 1980s.

    It is possible to add the instrumental series on from about 1980 (Mike
    sought of did this in his Nature article to say 1998 was the warmest of
    the millennium – and I did something similar in Rev. Geophys.) but there
    is no way Singer can say the proxy data doesn’t record the last 20 years
    of warming, as we don’t have enough of the proxy series after about 1980…..

    2. Everyone knows it was cooler during the Little Ice Age and warmer in
    the Medieval Warm Period.

    All of the millennial-long reconstructions show these features, but they
    are just less pronounced than people believed in the 1960s and 1970s,
    when there was much less paleo data and its spatial extent was limited
    to the eastern US/N.Atlantic/European and Far East areas. ….The typical comments I’ve heard, generally relate to the MWP, and say that crops and vines were grown further north than they are now (the vines grown in York in Viking times etc). Similarly, statements about frost fairs and freezing of the Baltic so armies could cross etc. Frost fairs on the Thames in London occurred more readily because the tidal limit was at the old London Bridge (the 5ft weir under it). The bridge was rebuilt around the 1840s and the frost fairs stopped. If statements continue to be based on historical accounts they will be easy to knock
    down with all the usual phrases such as the need for contemporary
    sources, reliable chroniclers and annalists, who witnessed the events
    rather than through hearsay. As you all know various people in CRU
    (maybe less so now) have considerable experience in dealing with this
    type of data. Christian Pfister also has a lifetime of experience of
    this. There is a paper coming out from the CRU conference with a
    reconstruction of summer and winter temps for Holland back to about
    AD 800, which shows the 20th century warmer than all others. Evidence is
    sparser before 1400 but the workers at KNMI (Aryan van Engelen et al.)
    take all this into account.

    I hope this is of use and hasn’t been a total waste of time.

  13. FTM
    Posted Nov 23, 2009 at 2:39 PM | Permalink

    Plus there is the additional uncertainty, discussed on the final page of the supplementary information, associated with linking the proxy records to real temperatures (remember we have no formal calibration, we’re just counting proxies — I’m still amazed that Science agreed to publish something where the main analysis only involves counting from 1 to 14! :-)).

    http://www.anelegantchaos.org/cru/emails.php?eid=1029&filename=1254345329.txt

    • GaryC
      Posted Nov 24, 2009 at 3:54 AM | Permalink

      What are the odds that the Science editor assigns Don Rickles to review the next paper submitted from this group?

      This is the kind of disrespect up with which the magazine shall not put.

  14. Posted Nov 23, 2009 at 2:52 PM | Permalink

    This whole thing must be any off-piste scientist’s worst nightmare.

    You climb the greasy pole for 30 yrs, become top of the tree and then someone leaks all your thoughts, secrets, desires and ambitions to millions of your detractors.

    And the crap data it’s all based on – which you’ve been protecting very aggressively – gets picked to bits over the internet by the same nit-pickers.

    I have a strange feeling of pity for those who are going to get squelched for their part in this.

    Perhaps a mea culpa from them is the only way out?

  15. FTM
    Posted Nov 23, 2009 at 2:55 PM | Permalink

    Hi Phil,
    is this another witch hunt (like Mann et al.)? How should I respond to the below? (I’m in the process of trying to persuade Siemens Corp. (a company with half a million employees in 190 countries!) to donate me a little cash to do some CO2 measurments here in the UK – looking promising, so the last thing I need is news articles calling into question (again) observed temperature increases – I thought we’d moved the debate beyond this, but seems that these sceptics are real die-hards!!). Kind regards, Andrew

    http://www.anelegantchaos.org/cru/emails.php?eid=1041&filename=1254832684.txt

    From: Michael Mann
    To: Phil Jones
    Subject: Re: attacks against Keith
    Date: Wed, 30 Sep 2009 11:06:20 -0400
    Cc: Gavin Schmidt , Tim Osborn

    Hi Phil,

    lets not get into the topic of hate mail. I promise you I could fill your inbox w/ a very long list of vitriolic attacks, diatribes, and threats I’ve received.

    Its part of the attack of the corporate-funded attack machine, i.e. its a direct and highly intended outcome of a highly orchestrated, heavily-funded corporate attack campaign. We saw it over the summer w/ the health insurance industry trying to defeat Obama’s health plan, we’ll see it now as the U.S. Senate moves on to focus on the cap & trade bill that passed congress this summer. It isn’t coincidental that the original McIntyre and McKitrick E&E paper w/ press release came out the day before the U.S. senate was considering the McCain
    Lieberman climate bill in ’05.

    we’re doing the best we can to expose this. I hope our Realclimate post goes some ways to exposing the campaign and pre-emptively deal w/ the continued onslaught we can expect over the next month.

    http://www.anelegantchaos.org/cru/emails.php?eid=1027&filename=1254323180.txt

    “Paranoya will destroy ya”

  16. Posted Nov 23, 2009 at 2:57 PM | Permalink

    Update from Bishop Hill

    http://bishophill.squarespace.com/blog/2009/11/23/harrabin-on-the-cru-hack.html

  17. Posted Nov 23, 2009 at 3:01 PM | Permalink

    Here’s my first bit of code analysis – http://www.di2.nu/200911/23a.htm – based not just on the Harry_read_me but on the files it refers to.

    Here’s my conclusion:
    I’ve examined two files in some depth and found (OK so Harry found some of this)

    * Inappropriate programming language usage
    * Totally nuts shell tricks
    * Hard coded constant files
    * Incoherent file naming conventions
    * Use of program library subroutines that appear to be
    o far from ideal in how they do things even when they work
    o inconsistent with other ways of calculating the same thing
    o liable to fail at undefined times
    o and, when they do fail, to let the program silently continue without reporting the error

    AAAAAAAAAARGGGGGHHHHHHHH!!!

    • Calvin Ball
      Posted Nov 23, 2009 at 3:06 PM | Permalink

      Are you sure you’re not looking at the source for Windows?

      • boballab
        Posted Nov 23, 2009 at 6:18 PM | Permalink

        Actually I believe that even the programmers for Windows are laughing at this mess.

  18. John Wright
    Posted Nov 23, 2009 at 3:25 PM | Permalink

    I am quite unqualified to begin to decipher the codes, but cursory reading of the accompanying comments makes it apparent that the man’s task was to bend the data until it complied with predetermined “requirements” on the good old GIGO principle. That’s what he was sweating over. I think it’s not impossible that we have stumbled across a mole – not a hacker. Having it downloaded from Russia would have been a way of covering his trail.

  19. Wolfgang Flamme
    Posted Nov 23, 2009 at 3:40 PM | Permalink

    Well finally that frames a context for ‘value added product’.

  20. Calvin Ball
    Posted Nov 23, 2009 at 5:05 PM | Permalink

    Not to be a buzzkill, but it appears that so far, we’ve found:

    1. very poorly written code
    2. very poorly managed programming projects
    3. incriminating comments

    Actually demonstrating that the executable code does what the comments say it does is going to be a LOT of work. So far, I think it’s safe to say that at best, this indicates extremely poor management. Finding worse may very well happen, but it’s going to take weeks, if not months, for people to actually untangle the executable code, and develop flowcharts or something else indicating ill intent.

    And even then, Jones et al. will always have the excuse of managerial incompetence. It’s not beyond belief that he could simply claim he didn’t understand what the program actually does.

    When it’s all done, expect the programmer to be the scapegoat.

    • Ace
      Posted Nov 24, 2009 at 11:22 AM | Permalink

      Not just that, but we don’t even know what this data was used for. It could have just been an exercise, it could have been discarded, or it could have been re-written prior to publication.

  21. Robin Melville
    Posted Nov 23, 2009 at 5:28 PM | Permalink

    Calvin Ball

    1. very poorly written code
    2. very poorly managed programming projects
    3. incriminating comments

    Actually, the situation is worse than this. Tom_R on WUWT has found some code where an array of “fudge” factors (described in the comments as “VERY ARTIFICIAL”) has been applied to dendro data to increase output temperatures in more recent data and decrease them at inconveniently high older points.

    Whatever the implications of the email traffic, the real “smoking gun” lies in the data and program files from the leak. These speak directly to the provenance and accuracy of the datasets which have formed the bases of current climate science hypotheses.

  22. Jack
    Posted Nov 23, 2009 at 5:45 PM | Permalink

    Oh give me a break. You don’t know who was working with the code (grad student? post doc? researcher?) and don’t know his/her background. You don’t know what the work was for (introduction to working with the data? Exercise for a class? For fun?). It’s beyond ridiculous to take some random person’s personal work diary and hold it up as an example showing why an entire body of work coming from decades of study by thousands of individuals is a hoax. This is absurd.

    • HankHenry
      Posted Nov 23, 2009 at 6:30 PM | Permalink

      Good point, Jack. That’s kind of the problem, isn’t it. And why is it we don’t know more about the code and data?

    • RomanM
      Posted Nov 23, 2009 at 6:48 PM | Permalink

      At least figure out what is being discussed here before postulating nonsense (introduction to working with the data? Exercise for a class? For fun?).

      The Harry file is 701 KB and details the travails of trying to fix the programs over a period of three years from 2006 to 2009.

      Try reading some of it first. It’s not that difficult to actually find some genuine information about the situation before spouting off in an obviously uninformed fashion.

    • Dave Dardinger
      Posted Nov 23, 2009 at 7:03 PM | Permalink

      It’s clear the programmer, who apparently isn’t Harry, was tasked with preparing the data for the new version 3.0 of their output. I feel sorry for him, but it’s apparent he was able to interact directly with Phil Jones some of the time. But there’s no reason to doubt that he had a very smelly swamp of data to traverse.

    • malcolm
      Posted Nov 23, 2009 at 7:12 PM | Permalink

      “Hapless Harry” is no grad student or contract programmer; he is Dr. Ian (Harry) Harris of the CRU, whose specialty is “dendrochronology” and “data manipulation”(!). See: http://www.cru.uea.ac.uk/cru/people/ .

    • John M
      Posted Nov 23, 2009 at 10:21 PM | Permalink

      Perhaps,

      But wouldn’t it be nice if the actual bona fide code was readily available?

      Then we wouldn’t have to try to find it amongst all the other garbage.

    • Dave
      Posted Nov 24, 2009 at 8:07 AM | Permalink

      Jack> The point is that it’s all irrelevant. ‘Science’ is open and replicable. ‘Climate science’ is neither open, nor, now we have got the secret data, is it verifiable or replicable.

      Applying the same standards of proof, you owe me a million dollars. Watch:

      1) I have a set of data that says you owe me a million dollars.
      2) But you can’t see it.

      See?

    • Ace
      Posted Nov 24, 2009 at 11:24 AM | Permalink

      Thank you Jack. Exactly what I have been trying to say. Most of these people are finding a conspiracy on just about every line, but they don’t even know what they are looking at for a start.

      This is ridiculous

  23. Raven
    Posted Nov 23, 2009 at 6:14 PM | Permalink

    Can anybody tell me why solar and precipitation data are required to calculate the global mean temperature? I would have expected that temps were sufficient.

    • Posted Nov 23, 2009 at 8:40 PM | Permalink

      Raven, on my own computer I have lots of experimental results that sit in several distinct data files (because they are measures of different things, resulting from different phases of some protocol). Programs then load up some subset of them, as required for what I’m doing at the time.

      This is just a guess, but I imagine they had the same kind of situation. CRU didn’t simply maintain a worldwide panel on temperatures; they also had research teams putting that together with other datasets for the specific work of various papers.

      So various data of various kinds lay about on the hard drives. Whoever did the actual computer work knew what to load up for particular analyses. They didn’t document that knowledge well, and when they left it was effectively gone.

      I have sometimes received old data (and here I mean just five years old) from people who cannot precisely remember what columns are what; or in several different files corresponding to several different treatments; and the original authors confess they are unsure which file is which treatment. In such cases I have to go to their papers and experiment–“audit” their results–until I am confident I know what’s what.

      I’m afraid this kind of thing is pretty common with low-stakes data sets from experiments that were relatively cheap. In this case, with a data set so obviously central to many hundreds of academic projects, it is pretty amazing. (If indeed we are seeing a sort of breakdown in the passing on of knowledge about that temperature data set.)

      • Posted Nov 24, 2009 at 2:05 AM | Permalink

        I’m afraid this kind of thing is pretty common with low-stakes data sets from experiments that were relatively cheap. In this case, with a data set so obviously central to many hundreds of academic projects, it is pretty amazing. (If indeed we are seeing a sort of breakdown in the passing on of knowledge about that temperature data set.)

        That’s exactly it. Once this thing became a key part of the IPCC etc. they should have spent the money tidying it up – sometime around 1997, I would think, seeing as at that point they should have already known they’d lost/mislaid the pre-1995 cloud cover bits. Instead they left this rat’s nest of code and data sitting for another 10 years, making it much, much harder to recover. They also got another scientist (the unfortunate Harry) to do the work of fixing it instead of getting a programmer who could do it properly.

  24. seagull
    Posted Nov 23, 2009 at 6:18 PM | Permalink

    Poor Harry complained about “the hopeless state of our databases.”
    Like Francis T and John Wright, above, I wonder if the databases are so disorganised and corrupted by old, error-ridden code that the findings originally derived from them can no longer be retrieved.

    Poor Harry may have been set to work to prepare these files for FOI.
    CRU team would then use secondary, derived data for their publications.

    This would give PJ rational grounds to refuse Steve and others the original data sets, but it would be embarrassing for them to admit this.

  25. Frank K.
    Posted Nov 23, 2009 at 7:30 PM | Permalink

    Jack:

    “It’s beyond ridiculous to take some random person’s personal work diary and hold it up as an example showing why an entire body of work coming from decades of study by thousands of individuals is a hoax. This is absurd.”

    No, it’s not ridiculous. This kind of coding and documentation practice appears to be standard procedure within the climate science community. As further evidence I give you:

    GISTEMP:

    http://data.giss.nasa.gov/gistemp/sources/

    For more on how really bad the whole GISTEMP code and data really are, please go here:

    http://chiefio.wordpress.com/

    GISS MODEL E:

    http://www.giss.nasa.gov/tools/modelE/modelEsrc/

    You get bonus points if you can ever figure out exactly what differential equations MODEL E is trying to solve. No one seems to know…and no one seems to care.

    By the way, GSFC (the government) recently invested 5 – 6 million dollars in *** US stimulus money *** so that GISS could run MODEL E for the IPCC AR5…

    http://www.nasa.gov/topics/earth/features/climate_computing.html

    • Posted Nov 25, 2009 at 8:42 AM | Permalink

      Here are a few looks into the NASA / GISS ModelE code.

      /2009/01/10/yet-even-more-nasagiss-modele-coding/

      /2008/10/31/pattern-matching-in-gissnasa-modele-coding/

      /2008/01/07/another-nasagiss-modele-code-fragment/

      /2007/08/09/coding-guidelines-and-inline-documentation-giss-modele/

      /2006/12/11/a-giss-modele-code-fragment/

      You’ll have to prepend http://edaniel.wordpress.com to each of those.

      All of the internal URL links at that site are broken at the present time.

  26. Paul Linsay
    Posted Nov 23, 2009 at 8:40 PM | Permalink

    Several gems from part 28.

    “With huge reluctance, I have dived into ‘anomdtb’ – and already I have that familiar Twilight Zone sensation.”

    “So having tested to ensure that the first of the pair hasn’t already been used – we then use it!”

    “In fact, I must conclude that an inquiring mind is a very dangerous thing”

  27. Jonathan Dumas
    Posted Nov 23, 2009 at 11:01 PM | Permalink

    Did this mail receive attention?

    http://www.eastangliaemails.com/emails.php?eid=302&filename=1047503776.txt

    From: Tim Osborn
    To: “Michael E. Mann” ,Tom Crowley , Phil Jones
    Subject: Re: Fwd: Soon & Baliunas
    Date: Wed, 12 Mar 2003 16:16:16 +0000
    Cc: Malcolm Hughes ,rbradley@xxxxxxxxx.xxx, mhughes@xxxxxxxxx.xxx,srutherford@xxxxxxxxx.xxx,k.briffa@xxxxxxxxx.xxx, mann@xxxxxxxxx.xxx

    This is an excellent idea, Mike, IN PRINCIPLE at least. In practise,
    however, it raises some interesting results (as I have found when
    attempting this myself) that may be difficult to avoid getting bogged down
    with discussing.

    The attached .pdf figure shows an example of what I have produced (NB.
    please don’t circulate this further, as it is from work that is currently
    being finished off – however, I’m happy to use it here to illustrate my point).

    I took 7 reconstructions and re-calibrated them over a common period and
    against an observed target series (in this case, land-only, Apr-Sep, >20N –
    BUT I GET SIMILAR RESULTS WITH OTHER CHOICES, and this re-calibration stage
    is not critical). You will have seen figures similar to this in stuff
    Keith and I have published. See the coloured lines in the attached figure.

    In this example I then simply took an unweighted average of the calibrated
    series, but the weighted average obtained via an EOF approach can give
    similar results. The average is shown by the thin black line (I’ve ignored
    the potential problems of series covering different periods). This was all
    done with raw, unsmoothed data, even though 30-yr smoothed curves are
    plotted in the figure.

    The thick black line is what I get when I re-calibrate the average record
    against my target observed series. THIS IS THE IMPORTANT BIT. The
    *re-calibrated* mean of the reconstructions is nowhere near the mean of the
    reconstructions. It has enhanced variability, because averaging the
    reconstructions results in a redder time series (there is less common
    variance between the reconstructions at the higher frequencies compared
    with the lower frequencies, so the former averages out to leave a smoother
    curve) and the re-calibration is then more of a case of fitting a trend
    (over my calibration period 1881-1960) to the observed trend. This results
    in enhanced variability, but also enhanced uncertainty (not shown here) due
    to fewer effective degrees of freedom during calibration.

    Obviously there are questions about observed target series, which series to
    include/exclude etc., but the same issue will arise regardless: the
    analysis will not likely lie near to the middle of the cloud of published
    series and explaining the reasons behind this etc. will obscure the message
    of a short EOS piece.

    It is, of course, interesting – not least for the comparison with
    borehole-based estimates – but that is for a separate paper, I think.

    My suggestion would be to stick with one of these options:
    (i) a single example reconstruction;
    (ii) a plot of a cloud of reconstructions;
    (iii) a plot of the “envelope” containing the cloud of reconstructions
    (perhaps also the envelope would encompass their uncertainty estimates),
    but without showing the individual reconstruction best guesses.

    How many votes for each?

    Cheers

    Tim

    At 15:32 12/03/03, Michael E. Mann wrote:
    >p.s. The idea of both a representative time-slice spatial plot emphasizing
    >the spatial variability of e.g. the MWP or LIA, and an EOF analysis of all
    >the records is a great idea. I’d like to suggest a small modification of
    >the latter:
    >
    >I would suggest we show 2 curves, representing the 1st PC of two different
    >groups, one of empirical reconstructions, the other of model simulations,
    >rather than just one in the time plot.
    >
    >Group #1 could include:
    >
    >1) Crowley & Lowery
    >2) Mann et al 1999
    >3) Bradley and Jones 1995
    >4) Jones et al, 1998
    >5) Briffa et al 200X? [Keith/Tim to provide their preferred MXD
    >reconstruction]
    >6) Esper et al [yes, no?–one series that differs from the others won’t
    >make much of a difference]

    >
    >I would suggest we scale the resulting PC to the CRU 1856-1960 annual
    >Northern Hemisphere mean instrumental record, which should overlap w/ all
    >of the series, and which pre-dates the MXD decline issue…

    >
    >Group #2 would include various model simulations using different forcings,
    >and with slightly different sensitivities. This could include 6 or so
    >simulation results:
    >
    >1) 3 series from Crowley (2000) [based on different solar/volcanic
    >reconstructions],
    >2) 2 series from Gerber et al (Bern modeling group result) [based on
    >different assumed sensitivities]
    >1) Bauer et al series (Claussen group EMIC result) [includes 19th/20th
    >century land use changes as a forcing].
    >
    >I would suggest that the model’s 20th century mean is aligned with the
    >20th century instrumental N.Hem mean for comparison (since this is when we
    >know the forcings best).
    >
    >
    >I’d like to nominate Scott R. as the collector of the time series and the
    >performer of the EOF analyses, scaling, and plotting, since Scott already
    >has many of the series and many of the appropriate analysis and plotting
    >tools set up to do this.
    >
    >We could each send our preferred versions of our respective time series to
    >Scott as an ascii attachment, etc.
    >
    >thoughts, comments?
    >
    >thanks,
    >
    >mike
    >
    >At 10:08 AM 3/12/2003 -0500, Michael E. Mann wrote:
    >>Thanks Tom,
    >>
    >>Either would be good, but Eos is an especially good idea. Both Ellen M-T
    >>and Keith Alverson are on the editorial board there, so I think there
    >>would be some receptiveness to such a submission.t
    >>
    >>I see this as complementary to other pieces that we have written or are
    >>currently writing (e.g. a review that Ray, Malcolm, and Henry Diaz are
    >>doing for Science on the MWP) and this should proceed entirely
    >>independently of that.
    >>
    >>If there is group interest in taking this tack, I’d be happy to contact
    >>Ellen/Keith about the potential interest in Eos, or I’d be happy to let
    >>Tom or Phil to take the lead too…
    >>
    >>Comments?
    >>
    >>mike
    >>
    >>At 09:15 AM 3/12/2003 -0500, Tom Crowley wrote:
    >>>Phil et al,
    >>>
    >>>I suggest either BAMS or Eos – the latter would probably be better
    >>>because it is shorter, quicker, has a wide distribution, and all the
    >>>points that need to be made have been made before.
    >>>
    >>>rather than dwelling on Soon and Baliunas I think the message should be
    >>>pointedly made against all of the standard claptrap being dredged up.
    >>>
    >>>I suggest two figures- one on time series and another showing the
    >>>spatial array of temperatures at one point in the Middle Ages. I
    >>>produced a few of those for the Ambio paper but already have one ready
    >>>for the Greenland settlement period 965-995 showing the regional nature
    >>>of the warmth in that figure. we could add a few new sites to it, but
    >>>if people think otherwise we could of course go in some other direction.
    >>>
    >>>rather than getting into the delicate question of which paleo
    >>>reconstruction to use I suggest that we show a time series that is an
    >>>eof of the different reconstructions – one that emphasizes the
    >>>commonality of the message.
    >>>
    >>>Tom
    >>>
    >>>
    >>>>Dear All,
    >>>> I agree with all the points being made and the multi-authored
    >>>> article would be a good idea,
    >>>> but how do we go about not letting it get buried somewhere. Can we
    >>>> not address the
    >>>> misconceptions by finally coming up with definitive dates for the LIA
    >>>> and MWP and
    >>>> redefining what we think the terms really mean? With all of us and
    >>>> more on the paper, it should
    >>>> carry a lot of weight. In a way we will be setting the agenda for
    >>>> what should be being done
    >>>> over the next few years.

    >>>> We do want a reputable journal but is The Holocene the right
    >>>> vehicle. It is probably the
    >>>> best of its class of journals out there. Mike and I were asked to
    >>>> write an article for the EGS
    >>>> journal of Surveys of Geophysics. You’ve not heard of this – few
    >>>> have, so we declined. However,
    >>>> it got me thinking that we could try for Reviews of Geophysics. Need
    >>>> to contact the editorial
    >>>> board to see if this might be possible. Just a thought, but it
    >>>> certainly has a high profile.
    >>>> What we want to write is NOT the scholarly review a la Jean Grove
    >>>> (bless her soul) that
    >>>> just reviews but doesn’t come to anything firm. We want a critical
    >>>> review that enables
    >>>> agendas to be set. Ray’s recent multi-authored piece goes a lot of
    >>>> the way so we need
    >>>> to build on this.
    >>>>
    >>>> Cheers
    >>>> Phil
    >>>>
    >>>>
    >>>>
    >>>>At 12:55 11/03/03 -0500, Michael E. Mann wrote:
    >>>>>HI Malcolm,
    >>>>>
    >>>>>Thanks for the feedback–I largely concur. I do, though, think there
    >>>>>is a particular problem with “Climate Research”. This is where my
    >>>>>colleague Pat Michaels now publishes exclusively, and his two closest
    >>>>>colleagues are on the editorial board and review editor board. So I
    >>>>>promise you, we’ll see more of this there, and I personally think
    >>>>>there *is* a bigger problem with the “messenger” in this case…
    >>>>>
    >>>>>But the Soon and Baliunas paper is its own, separate issue too. I too
    >>>>>like Tom’s latter idea, of a more hefty multi-authored piece in an
    >>>>>appropriate journal (Paleoceanography? Holocene?) that seeks to
    >>>>>correct a number of misconceptions out there, perhaps using Baliunas
    >>>>>and Soon as a case study (‘poster child’?), but taking on a slightly
    >>>>>greater territory too.
    >>>>>
    >>>>>Question is, who would take the lead role. I *know* we’re all very busy,
    >>>>>
    >>>>>mike
    >>>>>

    (…)
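
    The re-calibration effect Tim Osborn describes in the email above – averaging several reconstructions damps their uncorrelated high-frequency variance, so re-scaling the average against an observed target inflates its amplitude – can be illustrated with a small numerical sketch. This is a generic toy example in Python with made-up series, not CRU code or data:

    import numpy as np

    rng = np.random.default_rng(0)
    n_years, n_recon = 600, 7

    signal = np.cumsum(rng.normal(0, 0.05, n_years))   # shared low-frequency "climate" signal
    target = signal + rng.normal(0, 0.2, n_years)      # stand-in "observed" temperature series
    cal = slice(480, 560)                              # arbitrary common calibration window

    # Each "reconstruction" is the signal plus its own noise, calibrated to the target
    recons = []
    for _ in range(n_recon):
        raw = signal + rng.normal(0, 0.5, n_years)
        b, a = np.polyfit(raw[cal], target[cal], 1)    # simple regression calibration
        recons.append(a + b * raw)
    recons = np.array(recons)

    mean_recon = recons.mean(axis=0)                   # uncorrelated noise averages out, leaving a smoother (redder) series

    # Re-calibrate the mean of the calibrated series against the same target
    b2, a2 = np.polyfit(mean_recon[cal], target[cal], 1)
    recal = a2 + b2 * mean_recon

    print("std of individual calibrated reconstructions:", recons.std(axis=1).mean().round(3))
    print("std of their mean                            :", mean_recon.std().round(3))
    print("std of the re-calibrated mean                :", recal.std().round(3))
    # The re-calibrated mean is typically more variable than the plain mean,
    # which is the "enhanced variability" the email describes.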

  28. Jonathan Dumas
    Posted Nov 23, 2009 at 11:13 PM | Permalink

    and this one?

    http://www.eastangliaemails.com/emails.php?eid=498&filename=1109021312.txt

    From: Phil Jones
    To: mann@xxxxxxxxx.xxx
    Subject: Fwd: CCNet: PRESSURE GROWING ON CONTROVERSIAL RESEARCHER TO DISCLOSE SECRET DATA
    Date: Mon Feb 21 16:28:32 2005
    Cc: “raymond s. bradley” , “Malcolm Hughes”

    Mike, Ray and Malcolm,
    The skeptics seem to be building up a head of steam here ! Maybe we can use
    this to our advantage to get the series updated !
    Odd idea to update the proxies with satellite estimates of the lower troposphere
    rather than surface data !. Odder still that they don’t realise that Moberg et al used the
    Jones and Moberg updated series !
    Francis Zwiers is till onside. He said that PC1s produce hockey sticks. He stressed
    that the late 20th century is the warmest of the millennium, but Regaldo didn’t bother
    with that. Also ignored Francis’ comment about all the other series looking similar
    to MBH.
    The IPCC comes in for a lot of stick.
    Leave it to you to delete as appropriate !
    Cheers
    Phil
    PS I’m getting hassled by a couple of people to release the CRU station temperature data.
    Don’t any of you three tell anybody that the UK has a Freedom of Information Act !

  29. DeNihilist
    Posted Nov 24, 2009 at 1:52 AM | Permalink

    right, so if the science is settled, then why is there ever more money going into research for settled science?

    Time for these boys and girls to move onto unsettled science, like maybe the mating habits of european butterflies…..

  30. helvio
    Posted Nov 24, 2009 at 11:07 AM | Permalink

    One day we’ll see ClimateGate in Hollywood. “The Informant! 2”

  31. Mark
    Posted Nov 24, 2009 at 12:39 PM | Permalink

    Here are more gems from the program files:

    From the programming file called “briffa_sep98_d.pro”:

    yyy=reform(compmxd(*,2,1))
    ;mknormal,yyy,timey,refperiod=[1881,1940]
    ;
    ; Apply a VERY ARTIFICAL correction for decline!!
    ;
    yrloc=[1400,findgen(19)*5.+1904]
    valadj=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,2.5,2.6,2.6,$
    2.6,2.6,2.6]*0.75 ; fudge factor
    if n_elements(yrloc) ne n_elements(valadj) then message,’Oooops!’
    ;
    yearlyadj=interpol(valadj,yrloc,timey)
    ;

    From the programming file “combined_wavelet.pro”:

    restore,filename=’combtemp’+regtit+’_calibrated.idlsave’
    ;
    ; Remove missing data from start & end (end in 1960 due to decline)
    ;
    kl=where((yrmxd ge 1402) and (yrmxd le 1960),n)
    sst=prednh(kl)

    From the programming file “testeof.pro”:

    ; Computes EOFs of infilled calibrated MXD gridded dataset.
    ; Can use corrected or uncorrected MXD data (i.e., corrected for the decline).
    ; Do not usually rotate, since this loses the common volcanic and global
    ; warming signal, and results in regional-mean series instead.
    ; Generally use the correlation matrix EOFs.
    ;

    From the programming file: “pl_decline.pro”:

    ;
    ; Now apply a completely artificial adjustment for the decline
    ; (only where coefficient is positive!)
    ;
    tfac=declinets-cval

    From the programming file “olat_stp_modes.pro”:

    ;***TEMPORARY REPLACEMENT OF TIME SERIES BY RANDOM NOISE!
    ; nele=n_elements(onets)
    ; onets=randomn(seed,nele)
    ; for iele = 1 , nele-1 do onets(iele)=onets(iele)+0.35*onets(iele-1)
    ;***END
    mknormal,onets,pctime,refperiod=[1922,1995]
    if ivar eq 0 then begin
    if iretain eq 0 then modets=fltarr(mxdnyr,nretain)
    modets(*,iretain)=onets(*)
    endif
    ;
    ; Leading mode is contaminated by decline, so pre-filter it (but not
    ; the gridded datasets!)
    ;

    From the programming file “data4alps.pro”:

    printf,1,’IMPORTANT NOTE:’
    printf,1,’The data after 1960 should not be used. The tree-ring density’
    printf,1,’records tend to show a decline after 1960 relative to the summer’
    printf,1,’temperature in many high-latitude locations. In this data set’
    printf,1,’this “decline” has been artificially removed in an ad-hoc way, and’
    printf,1,’this means that data after 1960 no longer represent tree-ring
    printf,1,’density variations, but have been modified to look more like the
    printf,1,’observed temperatures.’
    ;

    From the programming file “mxd_pcr_localtemp.pro”

    ;
    ; Tries to reconstruct Apr-Sep temperatures, on a box-by-box basis, from the
    ; EOFs of the MXD data set. This is PCR, although PCs are used as predictors
    ; but not as predictands. This PCR-infilling must be done for a number of
    ; periods, with different EOFs for each period (due to different spatial
    ; coverage). *BUT* don’t do special PCR for the modern period (post-1976),
    ; since they won’t be used due to the decline/correction problem.
    ; Certain boxes that appear to reconstruct well are “manually” removed because
    ; they are isolated and away from any trees.
    ;

    From the programming file “calibrate_mxd.pro”:

    ;
    ; Due to the decline, all time series are first high-pass filter with a
    ; 40-yr filter, although the calibration equation is then applied to raw
    ; data.
    ;

    From the programming file “calibrate_correctmxd.pro”:

    ; We have previously (calibrate_mxd.pro) calibrated the high-pass filtered
    ; MXD over 1911-1990, applied the calibration to unfiltered MXD data (which
    ; gives a zero mean over 1881-1960) after extending the calibration to boxes
    ; without temperature data (pl_calibmxd1.pro). We have identified and
    ; artificially removed (i.e. corrected) the decline in this calibrated
    ; data set. We now recalibrate this corrected calibrated dataset against
    ; the unfiltered 1911-1990 temperature data, and apply the same calibration
    ; to the corrected and uncorrected calibrated MXD data.

    From the programming file “mxdgrid2ascii.pro”:

    printf,1,’NOTE: recent decline in tree-ring density has been ARTIFICIALLY’
    printf,1,’REMOVED to facilitate calibration. THEREFORE, post-1960 values’
    printf,1,’will be much closer to observed temperatures then they should be,’
    printf,1,’which will incorrectly imply the reconstruction is more skilful’
    printf,1,’than it actually is. See Osborn et al. (2004).’
    printf,1
    printf,1,’Osborn TJ, Briffa KR, Schweingruber FH and Jones PD (2004)’
    printf,1,’Annually resolved patterns of summer temperature over the Northern’
    printf,1,’Hemisphere since AD 1400 from a tree-ring-density network.’
    printf,1,’Submitted to Global and Planetary Change.’
    ;

    From the programming file “maps24.pro”:

    ;
    ; Plots 24 yearly maps of calibrated (PCR-infilled or not) MXD reconstructions
    ; of growing season temperatures. Uses “corrected” MXD – but shouldn’t usually
    ; plot past 1960 because these will be artificially adjusted to look closer to
    ; the real temperatures.
    ;
    if n_elements(yrstart) eq 0 then yrstart=1800
    if n_elements(doinfill) eq 0 then doinfill=0
    if yrstart gt 1937 then message,’Plotting into the decline period!’
    ;
    ; Now prepare for plotting
    ;

    From the programming file “calibrate_correctmxd.pro”:

    ;
    ; Now verify on a grid-box basis
    ; No need to verify the correct and uncorrected versions, since these
    ; should be identical prior to 1920 or 1930 or whenever the decline
    ; was corrected onwards from.
    ;

    From the programming file “recon1.pro”:

    ;
    ; Computes regressions on full, high and low pass MEAN timeseries of MXD
    ; anomalies against full NH temperatures.
    ;
    ; Specify period over which to compute the regressions (stop in 1940 to avoid
    ; the decline
    ;
    perst=1881.
    peren=1960.
    ;

    From the programming file “calibrate_nhrecon.pro”:

    ;
    ; Calibrates, usually via regression, various NH and quasi-NH records
    ; against NH or quasi-NH seasonal or annual temperatures.
    ;
    ; Specify period over which to compute the regressions (stop in 1960 to avoid
    ; the decline that affects tree-ring density records)
    ;
    perst=1881.
    peren=1960.

    From the programming file “briffa_sep98_e.pro”:

    ;
    ; PLOTS ‘ALL’ REGION MXD timeseries from age banded and from hugershoff
    ; standardised datasets.
    ; Reads Harry’s regional timeseries and outputs the 1600-1992 portion
    ; with missing values set appropriately. Uses mxd, and just the
    ; “all band” timeseries
    ;****** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********
    ;

  32. Corey
    Posted Nov 24, 2009 at 1:33 PM | Permalink

    We now recalibrate this corrected calibrated dataset against
    ; the unfiltered 1911-1990 temperature data, and apply the same calibration
    ; to the corrected and uncorrected calibrated MXD data.

    Uh……huh?

  33. Ian
    Posted Nov 24, 2009 at 2:27 PM | Permalink

    Here’s a little gem from Harry:

    “Back to the gridding. I am seriously worried that our flagship gridded data product is produced by
    Delaunay triangulation – apparently linear as well. As far as I can see, this renders the station
    counts totally meaningless. It also means that we cannot say exactly how the gridded data is arrived
    at from a statistical perspective – since we’re using an off-the-shelf product that isn’t documented
    sufficiently to say that. Why this wasn’t coded up in Fortran I don’t know – time pressures perhaps?
    Was too much effort expended on homogenisation, that there wasn’t enough time to write a gridding
    procedure? Of course, it’s too late for me to fix it too. Meh.”

    So much for their “flagship” product. It looks like it might just leak a little too much to float.

    Cheers.
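
    For anyone wondering what “gridding by linear Delaunay triangulation” looks like in practice, here is a minimal, hypothetical illustration in Python (not CRU’s routine, and with made-up station data): the scattered stations are triangulated and each grid point is interpolated linearly within the triangle containing it, so the number of stations near a cell plays no role in its value – which is the point Harry is making about station counts.

    import numpy as np
    from scipy.interpolate import griddata

    rng = np.random.default_rng(1)

    # Hypothetical station locations (lon, lat) and anomaly values
    stations = rng.uniform([-10.0, 35.0], [30.0, 70.0], size=(200, 2))
    anoms = rng.normal(0.0, 1.0, len(stations))

    # A 0.5-degree target grid, as in the CRU TS products
    lon = np.arange(-10, 30, 0.5)
    lat = np.arange(35, 70, 0.5)
    glon, glat = np.meshgrid(lon, lat)

    # Piecewise-linear interpolation over the Delaunay triangulation of the stations
    grid = griddata(stations, anoms, (glon, glat), method="linear")

    print(grid.shape)  # one interpolated value per grid cell; NaN outside the triangulation hull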

    • mark
      Posted Nov 24, 2009 at 3:08 PM | Permalink

      snip – please don’t editorialize about policy

  34. Peter
    Posted Nov 24, 2009 at 3:33 PM | Permalink

    Steve,

    I posted a similar comment on WUWT

    I’d be happy to volunteer to analyze/translate/comment/document/clean up some of the source code files, if enough people similarly volunteer to make manageable what would otherwise be a daunting task for a handful of people. Perhaps, if there’s enough of us, we can get through the whole lot in a reasonably short time.
    I’m an experienced C programmer, but I can sort of find my way around Fortran and other languages.
    Does anyone know how to set up a reasonably secure online repository which would serve this purpose?

  35. Mark
    Posted Nov 24, 2009 at 9:03 PM | Permalink

    Let’s take a look at exactly how the FORTRAN programs briffa_sep98_e.pro and briffa_sep98_d.pro (glad to see someone still appreciates the original compiled programming language) apply what their own comments describe as:

    “****** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********”

    my comments are within [[[ ]]], otherwise the code snippets are as they
    appear in the file.

    ;
    ; PLOTS ‘ALL’ REGION MXD timeseries from age banded and from hugershoff
    ; standardised datasets.
    ; Reads Harry’s regional timeseries and outputs the 1600-1992 portion
    ; with missing values set appropriately. Uses mxd, and just the
    ; “all band” timeseries
    ;****** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********
    ;
    yrloc=[1400,findgen(19)*5.+1904]

    [[[ this creates 20 consecutive 5-year subsets (possibly averaged) of the tree ring data by date, starting in year 1904 ]]]

    valadj=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,2.5,2.6,2.6,
    2.6,2.6,2.6]*0.75 ; fudge factor

    [[[ these are the 20 different “fudge factor(s)” – the programmer’s words, not mine – to be applied to the 20 different subsets of data, so here are those fudge factors with the corresponding years for the 20 consecutive 5-year periods:

    Year Fudge Factor
    1904 0
    1909 0
    1914 0
    1919 0
    1924 0
    1929 -0.1
    1934 -0.25
    1939 -0.3
    1944 0
    1949 -0.1
    1954 0.3
    1959 0.8
    1964 1.2
    1969 1.7
    1974 2.5
    1979 2.6
    1984 2.6
    1989 2.6
    1994 2.6
    1999 2.6

    a little further down the program adjusts the 20 datasets with the corresponding fudge factors: ]]]

    ;
    ; APPLY ARTIFICIAL CORRECTION
    ;
    yearlyadj=interpol(valadj,yrloc,x)
    densall=densall+yearlyadj

    [[[ So, we leave the data alone from 1904-1928, adjust downward a bit for 1929-1943 (different bits), leave it the same for 1944-1948, adjust down a little more for 1949-1953, and then, whoa, start an exponential fudge upward (guess that would be the “VERY ARTIFICIAL CORRECTION FOR DECLINE” noted by the programmer). Might this make data which don’t show the desired trend – or, god forbid, show a global temperature “DECLINE” – turn, after the “VERY ARTIFICIAL CORRECTION”, into a hockey schtick – I mean stick – and “HIDE THE DECLINE”? You bet it would! ]]]
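
    For readers who don’t read IDL, here is a small Python sketch of the arithmetic in the snippet quoted above – the anchor years, the 0.75-scaled “fudge factor” array, and the interpolation of the adjustment onto every year. It is only an illustration of what the quoted code computes, not a claim about how its output was used:

    import numpy as np

    # Anchor years: 1400, then 1904, 1909, ..., 1994 (IDL: yrloc=[1400,findgen(19)*5.+1904])
    yrloc = np.concatenate(([1400], 1904 + 5 * np.arange(19)))

    # The "fudge factor" array, scaled by 0.75 as in the IDL
    valadj = 0.75 * np.array([0, 0, 0, 0, 0, -0.1, -0.25, -0.3, 0, -0.1,
                              0.3, 0.8, 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6])

    # Interpolate the adjustment onto every year and add it to the series,
    # mirroring "yearlyadj=interpol(valadj,yrloc,x)" and "densall=densall+yearlyadj"
    years = np.arange(1400, 1995)
    yearlyadj = np.interp(years, yrloc, valadj)

    for yr in (1900, 1940, 1960, 1980, 1994):
        print(yr, round(float(yearlyadj[years == yr][0]), 3))
    # The adjustment is ~0 before the late 1920s, dips slightly, then rises steeply
    # after about 1950, reaching +1.95 from the mid-1970s onwards.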

    • Ace
      Posted Nov 25, 2009 at 7:49 AM | Permalink

      Hi Mark

      Do you not think that if the programmer so obviously admits
      “****** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********”

      Then he isn’t trying to cover anything up.

      Do you even know what this program was used for? Perhaps this code represents an exercise where the programmer tries to generate a graph that would show ideal results, so that this can be compared to real data in some other program. Perhaps there is a real problem with tree ring data and he is simply trying to figure out what could cause that offset in order to correct for it?

      I doubt he would label an array “fudge factor” if it was real data meant for submission to some scientific journal. Unless, of course, he was just evil and intentionally trying to deceive the world for his own malicious purposes.

      But we don’t know what his intention was do we? Neither do we know exactly how this code ended up being used. If it was even used at all.

      • Shona
        Posted Nov 26, 2009 at 2:33 AM | Permalink

        Look at UEA and Phil’s reaction. Silence. If it were some student’s U grade homework do you not think they would be shouting that fact from the rooftops?

        This has totally wrecked their reputation.

    • Mark
      Posted Nov 26, 2009 at 3:09 PM | Permalink

      corrections to my post:

      the program data sets actually appear to start in 1400, with the first containing all data from 1400-1903. So here are the 20 anchor years (1400, then 19 consecutive 5-year subsets), with the “fudge factors” and the “fudge factors” × 0.75, which is what was really used:

      Year Fudge Factor Fudge Factor*.75
      1400 0 0
      1904 0 0
      1909 0 0
      1914 0 0
      1919 0 0
      1924 -0.1 -0.075
      1929 -0.25 -0.1875
      1934 -0.3 -0.225
      1939 0 0
      1944 -0.1 -0.075
      1949 0.3 0.225
      1954 0.8 0.6
      1959 1.2 0.9
      1964 1.7 1.275
      1969 2.5 1.875
      1974 2.6 1.95
      1979 2.6 1.95
      1984 2.6 1.95
      1989 2.6 1.95
      1994 2.6 1.95

      Secondly, some say this code is IDL, not FORTRAN.
      To Ace: clearly this code was not meant for others to see, which is why (I assume) it was not released with FOIA requests. It appears – do I know for sure? No, how could I? – to match the content of the papers:

      ◦Osborn, T.J., Briffa, K.R., 2000. Revisiting timescale-dependent reconstruction of climate from tree-ring chronologies. Dendrochronologia 18, 9-26
      ◦Annually resolved patterns of summer temperature over the Northern Hemisphere since AD 1400 from a tree-ring-density network Timothy J. Osborn, Keith R. Briffa,Fritz H. Schweingruber, Phil D. Jones.

      Note one program is dated 3/4/99, just prior to the 2000 paper, and is called “briffa_sep98_d.pro”, so it might have been authored by Briffa or have involved Briffa’s data. The latest paper states “To overcome these problems, the decline is artificially removed from the calibrated tree-ring density series, for the purpose of making a final calibration. The removal is only temporary, because the final calibration is then applied to the unadjusted data set (i.e., without the decline artificially removed). Though this is rather an ad hoc approach, it does allow us to test the sensitivity of the calibration to time scale, and it also yields a reconstruction whose mean level is much less sensitive to the choice of calibration period”

      The questions I have regarding this “ad hoc” approach are:

      1. Why didn’t the papers reveal the “fudge factor”(s) – not my words, the programmer’s – used to produce the calibration of tree ring density? They happen to show an almost exponential rise post 1939. I find no justification of this.
      2. Why do said “fudge factor”(s) – there are twenty, applied to different time periods – increase some years, decrease others, and leave others the same, whereas the papers in effect say only “increase because of a recent decline” – and why is that even permissible?
      3. Why is it permissible to apply a “VERY ARTIFICIAL” adjustment to the “DECLINE” because of the “problems otherwise induced by” the “recent decline in high latitude tree-ring density?”
      4. Although the 2 papers only mention “adjusting” the data for purposes of obtaining a “temporary” “calibration”, is it not true that when a “VERY ARTIFICIAL” adjustment is used to create a “VERY ARTIFICIAL” calibration, and this “VERY ARTIFICIAL” calibration is then applied to the raw data, what you end up with is “VERY ARTIFICIAL” data?
      5. Why does the code I cited proceed to plot the “fudged” data with the “VERY ARTIFICIAL” adjustments, and not plot the raw data with the “fudged” calibration “without the decline artificially removed” (as the paper states)? If you are only using the “fudged” data for calibration purposes, why plot it?
      6. The programming files also use the words “VERY ARTIFICIAL…”, but the papers make this sound so routine and don’t use that adjective. Why use the word VERY unless you are implying “too much?”

  36. Ian
    Posted Nov 25, 2009 at 2:38 AM | Permalink

    This is fun (from Harry) [dealing with Australian data]

    “I am at a bit of a loss. It will take a very long time to resolve each of these ‘rogue’ stations. Time I do not have. The only pragmatic thing to do is to dump any stations that are too recent to have normals. They will not, after all, be contributing to the output. So I knocked out ‘goodnorm.for’, which simply uses the presence of a valid normals line to sort. The results were pretty scary:

    Stations retained: 5026
    Stations removed: 9283

    Essentially, two thirds of the stations have no normals! Of course, this still leaves us with a lot more stations than we had for tmean (goodnorm reported 3316 saved, 1749 deleted) though still far behind precipitation (goodnorm reported 7910 saved, 8027 deleted).

    I suspect the high percentage lost reflects the influx of modern Australian data. Indeed, nearly 3,000 of the 3,500-odd stations with missing WMO codes were excluded by this operation. This means that, for tmn.0702091139.dtb, 1240 Australian stations were lost, leaving only 278.”

    Now, he did feel this was extreme and wanted to find a way to fix it. It wasn’t clear (to me) whether he managed to fix the problem, as he seemed to get distracted by one or more of the other numerous problems.
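
    For what it’s worth, the filter Harry describes amounts to something like the following Python sketch. It is only an illustration of the idea (goodnorm.for itself is Fortran and is not in the quoted excerpts); the record layout, the missing-value code and the rule that a “valid” normals line has no missing months are all assumptions.

    # Hedged sketch: keep a station only if it carries a usable normals line.
    MISSING = -9999  # assumed missing-value code

    def has_valid_normals(normals):
        """normals: 12 monthly values; treated as valid if none are missing."""
        return all(v != MISSING for v in normals)

    def split_stations(stations):
        """stations: list of (station_id, normals) pairs."""
        kept = [s for s in stations if has_valid_normals(s[1])]
        removed = [s for s in stations if not has_valid_normals(s[1])]
        return kept, removed

    # Example: one complete normals line is retained, one all-missing line is dropped.
    kept, removed = split_stations([
        ("94120", [15, 16, 14, 12, 9, 7, 6, 7, 9, 11, 13, 14]),
        ("94999", [MISSING] * 12),
    ])
    print("Stations retained:", len(kept), "Stations removed:", len(removed))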

  37. Posted Nov 25, 2009 at 4:58 AM | Permalink

    More CRU code / data thoughts – http://www.di2.nu/200911/25.htm

  38. Indigo
    Posted Nov 25, 2009 at 6:13 AM | Permalink

    I feel very very sorry for Harry. His bosses have exploited him. I hope he is not having a nervous breakdown somewhere.

  39. CarlGullans
    Posted Nov 25, 2009 at 10:33 AM | Permalink

    I found this towards the end of the final text file:

    “This time around, (dedupedb.for), I took as simple an approach as possible – and almost immediately hit a problem that’s generic but which doesn’t seem to get much attention: what’s the minimum n for a reliable standard deviation?

    I wrote a quick Matlab proglet, stdevtest2.m, which takes a 12-column matrix of values and, for each month, calculates standard deviations using sliding windows of increasing size – finishing with the whole vector and what’s taken to be *the* standard deviation.

    The results are depressing. For Paris, with 237 years, +/- 20% of the real value was possible with even 40 values. Windter months were more variable than Summer ones of course. What we really need, and I don’t think it’ll happen of course, is a set of metrics (by latitude band perhaps) so that we have a broad measure of the acceptable minimum value count for a given month and location. Even better, a confidence figure that allowed the actual standard deviation comparison to be made with a looseness proportional to the sample size.

    All that’s beyond me – statistically and in terms of time. I’m going to have to say ’30’.. it’s pretty good apart from DJF. For the one station I’ve looked at.”

    It appears to me that the programmer wants to impose nonsensical limitations on the standard deviation of the data. While he appears to deal reasonably well with arcane and highly confusing code, he will sometimes make statements like this where (and he admits this) he has no idea, statistically, what he is doing.
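
    For context, here is a Python re-sketch of what stdevtest2.m is described as doing (the original is Matlab and is not quoted above, so the synthetic data and window sizes below are assumptions): compute the standard deviation over sliding windows of increasing size and compare each against the standard deviation of the whole series.

    # Sketch of the described test: how badly can the SD from a small sample
    # miss the SD of the full record?
    import numpy as np

    def sd_vs_sample_size(values, window_sizes):
        values = np.asarray(values, dtype=float)
        full_sd = values.std(ddof=1)          # taken as *the* standard deviation
        worst = {}
        for n in window_sizes:
            sds = [values[i:i + n].std(ddof=1) for i in range(len(values) - n + 1)]
            worst[n] = max(abs(sd - full_sd) / full_sd for sd in sds)
        return worst

    # Example with 237 years of synthetic January data (Paris has 237 years):
    rng = np.random.default_rng(1)
    january = rng.normal(3.0, 2.0, 237)
    for n, err in sd_vs_sample_size(january, [10, 20, 30, 40, 60]).items():
        print(f"n={n:3d}  worst relative error in SD: {err:.0%}")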

  40. CarlGullans
    Posted Nov 25, 2009 at 10:42 AM | Permalink

    Also, I read that all precipitation data from 1990 onwards is apparently synthetic (extrapolated/predicted)… Harry’s comments:

    ARGH. Just went back to check on synthetic production. Apparently – I have no memory of this at all – we’re not doing observed rain days! It’s all synthetic from 1990 onwards. So I’m going to need conditionals in the update program to handle that. And separate gridding before 1989. And what TF happens to station counts?

    OH FUCK THIS. It’s Sunday evening, I’ve worked all weekend, and just when I thought it was done I’m hitting yet another problem that’s based on the hopeless state of our databases. There is no uniform data integrity, it’s just a catalogue of issues that continues to grow as they’re found.
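
    The “conditionals” Harry says he will need presumably look something like the sketch below. This is purely illustrative: the function names and the way the 1990 cutoff is handled are assumptions based only on his comment above.

    # Hypothetical sketch of the observed-versus-synthetic branch for rain days.
    CUTOFF_YEAR = 1990

    def rain_days_for(year, observed, synthesize):
        """Use observed rain-day counts before the cutoff, synthetic values after."""
        if year < CUTOFF_YEAR:
            return observed(year)    # gridded from station observations
        return synthesize(year)      # synthetic (extrapolated/predicted),
                                     # for which station counts are undefined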

  41. Anon
    Posted Nov 25, 2009 at 12:14 PM | Permalink

    From “The people -vs- the CRU: Freedom of information, my okole”. Clearly, the following admission by Phil Jones kills CRU’s whole position about “not having the data”:
    http://wattsupwiththat.com/2009/11/24/the-people-vs-the-cru-freedom-of-information-my-okole…

    “I have had a couple of exchanges with Courtillot. This is the last of them from March 26, 2007. I sent him a number of papers to read. He seems incapable of grasping the concept of spatial degrees of freedom, and how this number can change according to timescale. I also told him where he can get station data at NCDC and GISS (as I took a decision ages ago not to release our station data, mainly because of McIntyre). I told him all this as well when we met at a meeting of the French Academy in early March.”

    However, it seems to me the HARRY_READ_ME.txt is also critical in the debate about reproducing CRU’s results for a reason I’ve not seen picked up on to date.

    Look at the first line:

    “READ ME for Harry’s work on the CRU TS2.1/3.0 datasets, 2006-2009!”

    Now other comments in the file make it clear that “HARRY” was trying to recreate CRU’s published results FOR THREE YEARS AND FAILED.

    Do you see the REAL significance of this? It is absolutely fatal to the credibility of anything CRU has produced.

    What we have here is a documented THREE-year effort by a CRU programmer who had access to all the data, access to all the code, and access to all the people who developed the code and the models, and still HE could NOT duplicate CRU’s OWN results. If he can’t, it simply means that CRU’s results cannot be reproduced even by themselves, so there is no point in anyone else even trying: CRU themselves have proven it is a waste of time, and in doing so they have proven that their own results are plain rubbish.

    A very nice layman’s summary of some of the issues in the HARRY_READ_ME.txt file can be found here:

    http://www.devilskitchen.me.uk/2009/11/data-horribilis-harryreadmetxt-file.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+TheDevilsKitchen+%28The+Devil%27s+Kitchen%29&utm_content=Netvibes

  42. Pat Keating
    Posted Nov 25, 2009 at 12:32 PM | Permalink

    From Mark’s post:
    ……; Plots 24 yearly maps of calibrated (PCR-infilled or not) MXD reconstructions
    ; of growing season temperatures. Uses “corrected” MXD – but shouldn’t usually
    ; plot past 1960 because these will be artificially adjusted to look closer to
    ; the real temperatures.

  43. David L. Hagen
    Posted Nov 25, 2009 at 7:56 PM | Permalink

    Could the CRU “fudge factor” have been applied to NZ data?

    See discussion above: “valadj=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,2.5,2.6,2.6,
    2.6,2.6,2.6]*0.75 ; fudge factor”

    See the dramatic difference between the official published NZ data, which show 0.92 deg/century of warming, and the raw NZ data, which show 0.06 deg/century.

    See Are we feeling warmer yet?

    Note particularly:

    Dr Jim Salinger (who no longer works for NIWA) started this graph in the 1980s when he was at CRU (Climate Research Unit at the University of East Anglia, UK) and it has been updated with the most recent data. It’s published on NIWA’s website and in their climate-related publications.

    See the detailed analysis in the 25 November 2009 paper Are we feeling warmer yet? (collated by Richard Treadgold, of the Climate Conversation Group, from a combined research project undertaken by members of the Climate Conversation Group and the New Zealand Climate Science Coalition).

  44. Follow the Money
    Posted Nov 25, 2009 at 9:37 PM | Permalink

    I grow to like Harry. A nice guy, sympathetic and philosophical.

    http://di2.nu/foia/HARRY_READ_ME-35q.html

    Luckily, this isn’t really up to me. Or.. is it? If the operator specifies a time period to update, it ought to warn if it finds earlier updates in those files. So further mods to mcdw2cruauto are required.. its results file must list extras. Or – ooh! How about a SECOND output database for the MCDW updates, containing just the OVERDUE stuff?

    Back.. think.. even more complicated. My head hurts. No, it actually does. And I ought to be on my way home. But look, we create a new master database (for each parameter) every time we update, don’t we? What we ought to do is provide a log file for each new database, identifying which data have been added. Oh, God. OK, let’s go..
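
    The idea Harry lands on (a log file for each new master database, identifying which data have been added) could look something like this sketch. It is illustrative only; the log-file name and the record format are assumptions, not anything from the archive.

    # Hypothetical sketch: whenever a new master database is written,
    # also write a log listing exactly which records were added.
    from datetime import date

    def write_update_log(db_name, added_records):
        log_name = f"{db_name}.{date.today():%Y%m%d}.log"
        with open(log_name, "w") as log:
            log.write(f"Records added to {db_name}:\n")
            for station_id, year, month, value in added_records:
                log.write(f"{station_id} {year}-{month:02d} {value}\n")
        return log_name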

  45. Alex Harvey
    Posted Nov 26, 2009 at 1:58 AM | Permalink

    This is all exciting stuff, but who is this Harry, and why did he write this?

    What is this supposed to mean:

    5492 I think this can only be fixed in one of two ways:
    5493
    5494 1. By hand.
    5495
    5496 2. By automatic comparison with other (more reliable) databases.
    5497
    5498 As usual – I’m going with 2. Hold onto your hats.

    “Hold onto your hats?” He is writing this stuff with a view to others later reading it. This is suspicious, frankly; it doesn’t seem professional. I am a computer programmer, and I would never write this sort of thing in a text file, regardless of how I felt about what I was doing. Someone would be bound to find it, sooner or later.

    It seems to me that he either wants to get caught… or this file is not genuine.

  46. Mark
    Posted Nov 26, 2009 at 3:20 PM | Permalink

    question #4 above should have read:

    4. Why does the code I cited proceed to plot the “fudged” data with the “VERY ARTIFICIAL” adjustments and does not plot the raw data “without the decline artificially removed”(as the paper states)? If you are only using the “fudged” data for calibration purposes, why plot it?

  47. Alex Harvey
    Posted Nov 27, 2009 at 8:48 AM | Permalink

    Dear Steve,

    Despite being skeptical of the science, and on your side in all of this Mann & Jones FOI dispute, I am finding this HARRY_READ_ME.txt file too good to be true.

    The more I read it, the more my gut tells me that the author of the file, or perhaps even the author of some of the comments in it, has in some way or other manufactured this readme file.

    What I really don’t like about this file is that, apparently, on day one, the author was picking holes in really minor things.

    For instance, on the same day he worked out the two filesystems of interest, he has this to say about the most useful README files:

    (yes, they all have different name formats, and yes, one does begin ‘_’!)

    Why would this guy be so picky on what is apparently his first day in a new job? It is normal for filesystems to look like this. Even the best of people, as far as I’ve ever been able to observe, generally don’t keep their doco up to date. So it doesn’t make sense to me that he’d be having a go at the naming conventions of readme files on his first day.

    The other possibility is that he later went back and added editorial commentary to the file.

    If so, why?

  48. scottbert
    Posted Nov 30, 2009 at 10:51 AM | Permalink

    When you get a piece of software, you test it with sample data. When you test it with sample data, you try very hard to reproduce the results you already know about. You also complain about the software not working and question why your results aren’t doing quite what they’re meant to.

    Which seems to be all that “Harry” is doing in this document…

  49. Dr. Dweeb
    Posted Dec 2, 2009 at 10:53 AM | Permalink

    Re: Alex….

    You are of course joking, or ignorant of how programmers think and function.

    Naming conventions are one of the very first things one looks at, as when they are cocked up, it is likely that the rest will be cocked up as well. Good programming is the result of an ordered structured mind. A casual perusal of the code in question here reveals neither good code, nor an author with a structured mind.

    I feel Harry’s pain, I really do, because I have had to do this sort of job many times and it is a serious PITA.

    DrDweeb

One Trackback

  1. […] McIntyre has done the necessary work, and lots more goodies are coming out of the Hadley CRU readme file, confirming from inside what Steve proved from outside, but it needs to be organized and […]