Have any of you seen any articles discussing which model runs are archived? It doesn’t appear to me that all model runs are archived. So what criteria are used by the modelers at PCMDI to decide which model runs are archived? (This is a different question than IPCC selections from the PCMDI population.) We’re all familiar with cherry-picking bias in Team multiproxy studies, e.g. the addiction to bristlecones and Yamal. It would be nice to think that the PCMDI contributors don’t have a corresponding addiction.
Figure 1 below shows the number of 20CEN runs per model in the Santer collection of 49 runs. A few models have 5 runs (GISS EH, GISS ER, NCAR CCSM, Japan MRI), but many models have only one run.
PCMDI now has 81 20CEN runs (78 at KNMI), but the distribution has become even more unbalanced, with much of the increase coming from further additions to already well-represented models, e.g. NCAR CCSM.
Figure 2. KNMI 20CEN Runs (78) by Model
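For anyone wanting to reproduce the tally, here’s a minimal sketch of the count of runs by model. The file name and column layout are hypothetical stand-ins for however the KNMI/PCMDI inventory has been scraped, not the actual archive format:

```python
# Minimal sketch of the tally behind Figures 1 and 2: count archived 20CEN
# runs per model. "knmi_20cen_runs.csv" and its "model" column are
# hypothetical placeholders for a table with one row per archived run.
import pandas as pd

runs = pd.read_csv("knmi_20cen_runs.csv")            # one row per archived run
counts = runs["model"].value_counts().sort_values()  # runs per model, ascending

print(counts)                          # singletons appear at the top
print("total runs:", counts.sum())     # should be 78 for the KNMI collection
singletons = counts[counts == 1].index.tolist()
print("models with only one archived run:", singletons)
```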
It’s hard to envisage circumstances under which a modeling agency would have only 1 or 2 runs in its portfolio. Models with only one archived 20CEN run include: BCCR BCM2.0, Canadian CGCM 3.1(T63), CNRM CM3, ECHAM4, INM CM3.0, IPSL CM4 and MIROC 3.2 (hi-res). Models with only two archived 20CEN runs include the influential HadCM3 and HadGEM1. Surely there are other runs lying around? Why are some archived and not others?
The non-archiving impacts things like Santer et al 2008. One of the forms of supposed uncertainty used by Santer to argue against a statistically significant difference between models and observations is autocorrelation uncertainty in the models. While we are limited on the observation side by the fact that we’ve only got one earth to study, a few more available runs of each model would do wonders in reducing the supposed uncertainty in model trends. Santer should probably have 1) thrown out any model for which only one run was archived; 2) written to the modeling agencies asking for more runs; and 3) included a critical note on the non-archiving agencies in his paper (though I’m led to believe by reviewers that such criticism would be “unscientific”).
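To see why more runs matter: under the usual independence assumptions, if a model’s between-run trend standard deviation is sigma, the standard error of its ensemble-mean trend shrinks as sigma divided by the square root of n. A back-of-envelope sketch (the sigma value is purely illustrative):

```python
# Standard error of an ensemble-mean trend versus number of archived runs.
# The sigma value below is a made-up illustration, not taken from any model.
import math

sigma = 0.10  # hypothetical between-run trend SD (deg C/decade)
for n in (1, 2, 5, 10):
    se = sigma / math.sqrt(n)
    print(f"n = {n:2d} runs: SE of ensemble-mean trend = {se:.3f} deg C/decade")
# Going from 1 run to 5 cuts the uncertainty by a factor of sqrt(5), about 2.2.
```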
Here’s another interesting scatter plot, illustrating an odd relationship between the mean trend (for each model) and the trend standard deviation (for each model). This is done only for multi-run models, as a standard deviation is obviously not defined for singletons.
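For concreteness, here’s a minimal sketch of the calculation behind the plot. It assumes a hypothetical table of per-run trends (“model_run_trends.csv” with “model” and “trend” columns are placeholders, not the actual archive layout):

```python
# Per-model mean trend vs per-model trend SD, restricted to multi-run models,
# with an ordinary least-squares fit of SD on mean across models.
import pandas as pd
from scipy import stats

trends = pd.read_csv("model_run_trends.csv")   # one row per run: model, trend
by_model = trends.groupby("model")["trend"].agg(["mean", "std", "count"])
multi = by_model[by_model["count"] > 1]        # drop singletons: no SD defined

fit = stats.linregress(multi["mean"], multi["std"])
print(f"slope = {fit.slope:.3f}, r = {fit.rvalue:.2f}, p = {fit.pvalue:.3f}")
```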
The above relationship is “significant” in statistical terms. But why should there be a relationship between the mean stratified by model and the standard deviation stratified by model? I’ve had to scratch my head a little even to think up how this might happen. I think that such a relationship could be established by a bias in favor of inclusion of (shall we say) DD trends relative to their less endowed cousins.
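Here’s a toy Monte Carlo of that mechanism, with all numbers made up for illustration. Give 20 fictitious models the same true mean trend but different between-run spreads; have each run 10 realizations and archive only its 3 largest trends. The selection boost to the archived mean is then proportional to each model’s spread, so an apparent mean-vs-SD relationship should come out clearly positive across models even though the underlying mean trend is identical everywhere:

```python
# Toy simulation of selective archiving. Every fictitious model shares the
# same true mean trend; only the between-run spread differs. Archiving the
# top 3 of 10 runs inflates each archived mean in proportion to the spread,
# manufacturing a mean-vs-SD relationship where none exists.
import numpy as np

rng = np.random.default_rng(0)
true_mean = 0.20                                  # common true trend (deg C/decade)
sigmas = rng.uniform(0.02, 0.15, size=20)         # model-specific between-run SD

means, sds = [], []
for sigma in sigmas:
    runs = rng.normal(true_mean, sigma, size=10)  # 10 realizations
    kept = np.sort(runs)[-3:]                     # archive only the top 3
    means.append(kept.mean())
    sds.append(kept.std(ddof=1))

r = np.corrcoef(means, sds)[0, 1]
print(f"correlation between archived mean and archived SD: r = {r:.2f}")
```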
Or perhaps there’s some mundane reason that would trouble no one. Unfortunately, IPCC doesn’t seem to have established objective criteria requiring modeling agencies to archive all their results and so, for now, I’m left a bit puzzled. For the record, I’m not alleging such a bias on the present evidence. But it is entirely legitimate to ask what the selection criteria are. Not that I expect to get an answer.