Jacoby Response to Data Request
Jacoby and d’Arrigo [Clim. Chg. 1989], together with d’Arrigo and Jacoby, is a temperature reconstruction applied in many multiproxy studies (e.g. Jones et al.; 11 series used individually in MBH98; as an "adjustment" to the North American PC1 in MBH99; Jones and Mann). Jacoby is a member of the Hockey Team.
Jacoby and d’Arrigo states on page 44 that they sampled 36 northern boreal forest sites within the preceding decade, of which the ten "judged to provide the best record of temperature-influenced tree growth" were selected. No criteria for this judgement are described; one presumes that they picked the 10 most hockey-stick shaped series.
I have run simulations indicating that merely selecting the 10 most hockey-stick shaped series from 36 red noise series and then averaging them results in a composite that is more hockey-stick shaped than the individual series. The process is not dissimilar to what happens in the MBH98 PC1, where the 14 most hockey-stick shaped series account for over 93% of the variance; there is very little difference in appearance between a simple average of these 14 series and the EOF-weighted composite (PC1).
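The selection effect can be sketched in a few lines. This is a minimal illustration of the kind of simulation described above, not Jacoby's procedure or my original simulation code: the AR(1) red-noise model, the series length, and the "hockey-stick index" (closing-segment mean minus full-series mean, in standard-deviation units) are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def red_noise(n=600, rho=0.9):
    """AR(1) 'red noise' series with lag-one autocorrelation rho."""
    x = np.zeros(n)
    e = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + e[t]
    return x

def hs_index(x, closing=100):
    """Crude hockey-stick index: how far the closing-segment mean sits
    above the full-series mean, in standard-deviation units."""
    return (x[-closing:].mean() - x.mean()) / x.std()

# 36 red-noise "sites" containing no climate signal whatsoever
sites = [red_noise() for _ in range(36)]

# select the 10 most "temperature-influenced" sites, i.e. the 10 with
# the largest hockey-stick index, and average them into a composite
top10 = sorted(sites, key=hs_index, reverse=True)[:10]
composite = np.mean(top10, axis=0)

# across all 36 series the index averages out close to zero, but the
# selected composite inherits (and concentrates) the closing uptick
print(np.mean([hs_index(s) for s in sites]))
print(hs_index(composite))
```

Averaging the selected series shrinks the year-to-year noise while preserving the closing uptick that the selection screened for, which is why the composite ends up more hockey-stick shaped than any typical individual series.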
I was interested in testing whether Jacoby’s selection process imparted a bias to the data set under consideration. In order to test whether Jacoby’s selection of the 10 most "temperature-influenced" series had any significance relative to comparable selection from red noise series, I looked for the archived information of the 36 sites. I had previously located the 10 most "temperature-influenced" sites at WDCP archives (and have elsewhere discussed inconsistencies between this archive and the versions used in MBH98), but was unable to locate archived versions of the other 26 series.
As a result of some previous exchanges with Climatic Change (which I will probably discuss on another occasion), they adopted a policy in which authors were required to provide supporting data, but decided not to adopt a policy requiring authors to provide source code. Under this policy, I asked them to obtain the other 26 series from Jacoby (since I had had no success directly).
Jacoby refused to provide the 26 series, and I found his reasoning, as set out to Climatic Change, quite interesting (my bolds).
The inquiry is not asking for the data used in the paper (which is available), they are asking for the data that we did not use. We have received several requests of this sort and I guess it is time to provide a full explanation of our operating system to try to bring the question to closure.
Speaking for myself and immediate colleagues who have been involved with my research: Most of our research has been mission-oriented, dendroclimatic research. That means to find climatically-sensitive, old-aged trees and sample them in order to extend the quantitative record of climatic variations. Also, to relate these records to the real world and investigate the climate system and its functioning.
The first part produces absolutely-dated time series of tree-ring variations. We try to sample trees at sites where there is likely to be a strong climatic signal, usually temperature or precipitation. Sometimes we are successful, sometimes we are not. We compare the tree-ring series to climate records to test what the climate signal is. We sample latitudinal treeline and elevational treeline looking for temperature-sensitive trees with both a high-frequency and low-frequency response to temperature. A high-frequency temperature response to summer is most frequently found at these extreme locations. However, trees have much more information if one finds trees with good communal high- and low-frequency variations that correspond or correlate to local or regional temperatures for longer seasons. There is abundant information to explain the physiological processes in cooler seasons and why trees can respond to more than just the summer season. The sampling and development of a tree-ring chronology is an investment of research energy, time, and money.
The best efforts in site selection and sampling do not always produce a good chronology. It is only as the samples are processed and analyzed that the quality, or lack thereof, becomes evident. First is the dating: this is enabled by high-frequency common variation among the trees. The dating is achieved and tested by various methods. Then the chronology is developed from the correctly dated ring-width measurements and evaluated. Testing: Is there a common low-frequency signal among the trees? At a good temperature-sensitive site with good trees, there is. We conduct common period analyses of the low-frequency variation within the core samples from a site.
Sometimes, even with our best efforts in the field, there may not be a common low-frequency variation among the cores or trees at a site. This result would mean that the trees are influenced by other factors that interfere with the climate response. There can be fire, insect infestation, wind, or ice storm etc. that disturb the trees. Or there can be ecological factors that influence growth. We try to avoid the problems but sometimes cannot and it is in data processing that the non-climatic disturbances are revealed.
We strive to develop and use the best data possible. The criteria are good common low and high-frequency variation, absence of evidence of disturbance (either observed at the site or in the data), and correspondence or correlation with local or regional temperature. If a chronology does not satisfy these criteria, we do not use it. The quality can be evaluated at various steps in the development process. As we are mission oriented, we do not waste time on further analyses if it is apparent that the resulting chronology would be of inferior quality.
If we get a good climatic story from a chronology, we write a paper using it. That is our funded mission. It does not make sense to expend efforts on marginal or poor data and it is a waste of funding agency and taxpayer dollars. The rejected data are set aside and not archived.
As we progress through the years from one computer medium to another, the unused data may be neglected. Some [researchers] feel that if you gather enough data and n approaches infinity, all noise will cancel out and a true signal will come through. That is not true. I maintain that one should not add data without signal. It only increases error bars and obscures signal.
As an ex-marine I refer to the concept of a few good men.
A lesser amount of good data is better without a copious amount of poor data stirred in. Those who feel that somewhere we have the dead sea scrolls or an apocrypha of good dendroclimatic data that they can discover are doomed to disappointment. There is none. Fifteen years is not a delay. It is a time for poorer quality data to be neglected and not archived. Fortunately our improved skills and experience have brought us to a better recent record than the 10 out of 36. I firmly believe we serve funding agencies and taxpayers better by concentrating on analyses and archiving of good data rather than preservation of poor data.
I guess I won’t be getting the data. My position would be that, if they picked 10 of 36 sites, then they used all 36 sites in their study. Imagine this argument in the hands of a drug trial: suppose the investigators studied 36 patients, picked the 10 patients with the "best" responses, and then refused to produce data on the other 26 patients on the grounds that they didn’t discuss these other patients in their study. It’s too ridiculous for words.