Would a very simple refresher course in calculus be helpful, e.g., the definition of a derivative and examples of the application of that definition to sines and cosines in order to see how the formulas in the link were derived?

One only needs one trig formula to do that, namely the formula for the sine of the sum of two angles.
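
For the sine, the definition of the derivative and the sum formula give the result in a few lines. This sketch assumes the two standard limits $\sin h / h \to 1$ and $(\cos h - 1)/h \to 0$ as $h \to 0$, which are not derived here:

```latex
\begin{aligned}
\frac{d}{dx}\sin x
  &= \lim_{h \to 0} \frac{\sin(x+h) - \sin x}{h} \\
  &= \lim_{h \to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\
  &= \sin x \,\lim_{h \to 0} \frac{\cos h - 1}{h}
     + \cos x \,\lim_{h \to 0} \frac{\sin h}{h} \\
  &= \cos x .
\end{aligned}
```

The cosine then follows from $\cos x = \sin(x + \pi/2)$, which gives $-\sin x$.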

Jerry

]]>LaTeX is used by most scientists to write their manuscripts, so you studied under a very famous professor.

I used LINUX nroff for many years and the similarity between the two text processors allows for a fairly easy transition. Just need to swap things like sub and sup for _ and ^. 🙂

Jerry

]]>I do understand enough to recognize the symbols etc… and I studied under Dr Knuth who invented TeX. But sadly this math is beyond me. (My dad would have much enjoyed all this: his PhD involved working out the mathematics of dendritic growth.)

]]>Thanks for finding the source of the problem.

I now feel that it is best to debug the LaTeX stuff offline and then store it on Google Drive so that I am not debugging it across multiple messages. Did you understand the smoothing impact of inverting the Laplacian on the noise in the right-hand side (large amplitudes in the high wave numbers of the right-hand side)? That explains how the geopotential (essentially pressure) is so smooth in the ECMWF plots even though the vorticity is extremely noisy. And the vorticity is noisy because of the inappropriate use of Richardson’s equation for the vertical velocity (instead of the correct 3-dimensional elliptic equation) and rough forcing.
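
The smoothing can be seen from a single Fourier mode. This is a minimal sketch, assuming a doubly periodic plane rather than the sphere, with $\phi$ for the geopotential and $f$ for the right-hand side (symbols chosen here for illustration only):

```latex
\nabla^2 \phi = f, \qquad
f = \sum_{k,l} \hat{f}_{k,l}\, e^{i(kx + ly)}
\;\Longrightarrow\;
\hat{\phi}_{k,l} = -\frac{\hat{f}_{k,l}}{k^2 + l^2}, \quad (k,l) \neq (0,0).
```

Each component of the right-hand side is divided by $k^2 + l^2$, so along one direction noise at wave number 10 is reduced by a factor of 100 while wave number 1 keeps its amplitude.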

Jerry

]]>– Surround the LaTeX with `$latex` (your stuff) `$`

– Be careful with auto-converted characters. One of Dr Browning’s final issues was an en dash in place of the minus sign, instead of a plain hyphen character. Unfortunately, many word processors will make that substitution automagically. Aren’t they so nice 😦

– Spelling of LaTeX “commands” matters. It’s \frac not \fract, for example. ]]>

The trends were determined using Singular Spectrum Analysis (SSA) with a window length of L=15 years and groups 1 and 2 to reconstruct the trend. The heuristics for SSA of all the observed and modeled series showed no significant periodic/cyclical components. SSA conveniently handles nonlinear trends.

I was striving to estimate the statistical significance of differences between the modeled and observed series and between modeled and modeled series. It became apparent that comparing the single measured realization of earth’s temperatures with a modeled result, even where the model has multiple runs, does not lend itself to standard frequentist null hypothesis testing. A model with multiple runs provides an estimated probability distribution on which one can locate the single observed realization. That is clearly not the same as comparing the mean of the model to an observed mean, since the observed has no mean. Further, the observed result cannot be located on a probability distribution of its own, i.e. we do not know how close to or far from a mean the result is; that mean could only be determined if we had several realizations of the observed, analogous to making multiple model runs. An attempt can be made to estimate the distribution of the observed series by finding a decent fit to, for example, an ARMA model. A similar approach can be applied to a model with a single run. Unfortunately that exercise does not allow determining where on that distribution the single realization/run falls, or in other words knowing the mean. Using simulations will not produce the information necessary to do standard hypothesis testing.

The best that can be done for the observed to model comparison where the model has multiple runs is to determine where on the probability curve of the model the observed result falls. For the analysis reported here this was accomplished using a t distribution. The standard deviation is used here in determining the distribution, and not the standard error of the mean as would be the case in comparing means. Obviously, the lower the probability at which the observed result falls, the less likely it is that the observed result is part of the modeled distribution.
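
A minimal sketch of locating one observed value on a model's run distribution is given here, using only the Python standard library. The function names, the sample numbers, the choice of a two-sided probability and the use of n-1 degrees of freedom are all assumptions made for illustration; the spreadsheet calculation may differ in those details.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_tail(t, df, steps=20000):
    """P(|T| >= |t|), by Simpson's rule on the density from 0 to |t|."""
    b = abs(t)
    if b == 0.0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def place_observation(model_runs, observed):
    """Locate one observed value on the distribution of a model's runs."""
    mean = statistics.mean(model_runs)
    sd = statistics.stdev(model_runs)  # standard deviation, not standard error
    df = len(model_runs) - 1           # assumed degrees of freedom
    t = (observed - mean) / sd
    return t, df, two_sided_tail(t, df)


# Hypothetical trend values for five runs of one model and one observed trend.
runs = [0.21, 0.19, 0.24, 0.22, 0.20]
t, df, p = place_observation(runs, observed=0.15)
print(f"t = {t:.2f}, df = {df}, two-sided probability = {p:.3f}")
```

Dividing by the standard deviation rather than the standard error keeps the question at "could this single value be one more run of the model", which is the comparison described above.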

A standard null hypothesis test can be applied when comparing two models where both have multiple runs and that is what I did in these analyses.
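
For two models that each have several runs, the comparison of means and of variances can be sketched as follows. Which test was actually used is not stated here, so Welch's unequal-variance t statistic and a plain variance ratio are assumptions, and the function names and sample numbers are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two sets of runs."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def variance_ratio(a, b):
    """F statistic for the null hypothesis that the variance ratio is 1."""
    return statistics.variance(a) / statistics.variance(b)


# Hypothetical trend values for the runs of two models.
model_a = [0.21, 0.19, 0.24, 0.22, 0.20]
model_b = [0.30, 0.27, 0.33, 0.29]
t, df = welch_t(model_a, model_b)
print(f"t = {t:.2f} on {df:.1f} degrees of freedom")
print(f"variance ratio = {variance_ratio(model_a, model_b):.2f}")
```

The probability for rejecting the null hypothesis then comes from the tail of the t distribution (for the means) or the F distribution (for the variance ratio) at these values.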

Comparing the single realized observed result to a model without multiple runs is not amenable to any standard frequentist comparison, other than showing the results in a histogram in order to view what might be deemed extreme values, or to indicate that the single model runs cannot be used as part of the same distribution. That is the approach I used for comparing the observed series with the models without multiple runs.

The results of these analyses are in an Excel file linked through Dropbox.

The left-most part of the Excel worksheet shows arrays for the temperature series variables with all paired combinations of models with multiple runs. The probabilities listed in the arrays are for the difference in model means being zero, or for the ratio (in the case of the variances) being 1, by chance, and thus can be used for rejection of the null hypothesis that the difference is zero or the ratio is 1. The yellow shaded results in the arrays are for rejection at the 5% level and the red text for rejection at the 1% level.

Moving from left to right in the worksheet, next come the arrays of all paired combinations of the 5 observed series variables with those of the models with multiple runs. Here the probabilities are for where on the model probability distribution the observed results fall, with less than 5% probability shaded in yellow and less than 1% probability shaded in red.

Moving further to the right one can observe the histograms of the 5 variable results for the models with single runs with the corresponding observed variables shown with identifying colored Xs.

Comparing variable means of models with multiple runs provides the narrowest distributions and a more sensitive method of determining statistically significant differences. Acknowledging those significant differences amongst the models is important in avoiding the practice of using these model results as part of the same distribution in comparisons with observed series variables. Acknowledging these differences on a more objective basis might well also overcome the collegial attitude that apparently is at work in avoiding judgments on the validity of the individual models, or at least on which come closest to representing the observed series variables. The largest number of paired model differences is seen in the variance comparisons, then the trend for 1880-2005, followed by the trend for 1970-2005, then the AR1 coefficient, and finally the NH/SH warming ratio.

The comparison of the paired variables for the observed and multiple run model series shows that the most differences are for the 1880-2005 trend and the fewest for the 1970-2005 trend, with the variables AR1, variance and NH/SH warming bunched together in the middle.

Models with single runs are, in my estimation, of little value for comparisons with observed series and with other models. I also judge that there is sufficient evidence against using the distribution of single runs for comparisons. I would consider models with only single runs as not serious efforts at being validated in any form or manner, and they should probably be ignored or greatly downplayed. The great advantage of modeling over dealing with the earth’s single realization (and the only one we will ever have) is the ability to look at multiple realizations, and thus a single run model is a wasted opportunity.

I suppose the results for the paired comparisons of the 5 variables for the models with multiple runs to the observed series could be used in grading the models’ capability to at least empirically approximate the single global realization, but it would have to be with the understanding that a frequentist approach to rejecting a null hypothesis is not possible. I think that a Bayesian approach that uses some theoretical considerations for constructing a prior probability might serve better in these evaluations and grading.

* The model temperature series is from the air immediately above the ocean surfaces (tas) while the observed series use SST (tos). There is a difference in the modeled temperature trends from these two sources of temperature, with tas being higher than tos, and as a result the temperature trends for the models were adjusted downward by 6% for the above analyses and the NH/SH warming ratios for the models were increased by a factor of 1.023.

Link to Excel file:

https://www.dropbox.com/s/zl71yyojsl5b5ns/RCP45_Observed_Compare.xlsx?dl=0

]]>The ECMWF, NCEP, NCAR, and Australian models all use similar code for the hydrostatic dynamical equations, namely the pseudo-spectral method (Fourier transforms in longitude and Gaussian quadrature in latitude).

A global grid in lat/lon with a finite difference method has problems at the poles because of the singularity of the Jacobian at the poles (horizontal velocities are multiple valued at the poles). The spectral method the models use is supposed to solve that problem (although that is not quite the case – see the Browning, Hack, Swarztrauber reference mentioned earlier). As I have shown, the ECMWF and NCEP global models both use a semi-implicit method that incorrectly forces geostrophic balance on the solution. Thus the dynamics of the models are almost identical.

The original NCAR atmospheric model dynamics was obtained from ECMWF.

The difference comes in through the physics (parameterizations). When ECMWF gave NCAR the dynamical part of their model, they withheld the physics (big secret). The obvious question is: if the parameterizations are accurately describing the physics, why are there so many different versions? Does arbitrary tuning come to mind? 🙂

It should become clear that parameterizations are not perfect and that the inaccuracy leads to large errors very quickly (see Sylvie Gravel’s manuscript). To overcome these problems in longer runs, the models are tuned to produce an energy spectrum that looks somewhat realistic, but is physical nonsense.

Jerry

]]>How much code do they share?

Are some of the models “offspring” of others?

Are they really all independent of each other?

If one can parametrize any of the models to perform adequately, why don’t they? ]]>