Thanks for the compliment. It was a fairly standard Analysis of Covariance. It would be better to have a larger set of stations (and possibly some changes in the model) if it were to be redone. Yes, it does indicate that there might be some differences due to quality of the stations.

]]>It’s very late on the thread but I have to say thanks for the nice work, I wondered how much these stations were affected by quality. It looks like just sorting them on quality will make a major improvement.

]]>It’s very late on the thread but I have to say thanks for the nice work, I really wondered how much these stations were affected by quality. It looks like just sorting them on quality will get the job done in the future.

]]>Does anyone have any potential explanations or conjectures for the interaction of the CRN station rating with the station population rating? I have been attempting to come up with some of my own, but in the end after some clearer thinking had to reject them. Perhaps some of the Watts team members have some insights into this phenomenon.

Just to remind you of what I am talking about here, I am summarizing RomanM’s interaction graphs in #130. The approximate range of the CRN1-5 trends is 0.3 for Rural stations, 0.9 for Suburban stations and 1.5 for Urban stations. The order of trends from lowest to highest for CRN1-5 is 2,3,1,4,5 for Rural; 1,4,3,2,5 for Suburban; and 1,2,3,4,5 for Urban stations.
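For anyone who wants to check the spreads and orderings rather than read them off the graphs, here is a minimal Python sketch using the CRNRate*PopRate adjusted means RomanM posted in this thread. The mapping of B to Suburban is my assumption (A is Rural and C is Urban per RomanM's recoding), and the variable names are only illustrative. Spreads computed from the table can differ a little from values estimated by eye from a graph.

```python
# Adjusted cell means for Trend, copied from RomanM's CRNRate*PopRate table.
# Keys are CRN ratings 1-5. "Suburban" for level B is an assumed label.
cells = {
    "Rural":    {1: 0.3649, 2: 0.2806, 3: 0.3340, 4: 0.5160, 5: 0.5446},
    "Suburban": {1: 0.4159, 2: 0.7202, 3: 0.5060, 4: 0.4242, 5: 1.2217},
    "Urban":    {1: 0.2891, 2: 0.5537, 3: 0.6427, 4: 1.2302, 5: 1.7869},
}

summary = {}
for pop, means in cells.items():
    # Spread = highest minus lowest adjusted trend across the five CRN ratings.
    spread = max(means.values()) - min(means.values())
    # CRN ratings sorted from lowest to highest adjusted trend.
    order = sorted(means, key=means.get)
    summary[pop] = (spread, order)
    print(f"{pop}: spread={spread:.2f}, order={order}")
```

Only the Urban stations come out in strict 1-through-5 order, which is the point of the question above.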

What I take from this result is that Urban stations show what one might expect from the CRN ratings, in both spread and progression from 1 through 5. Suburban stations appear intermediate between Rural and Urban in spread, as one might expect for an interaction, but for both Rural and Suburban stations the CRN progression is disrupted.

In general, one might want to view these results as the Rural and Suburban environment having a mitigating effect on the CRN micro site differences, i.e. they are more forgiving of these differences than the Urban environment.

I suppose if we could come up with a reasonable explanation of this effect, we could go back to the regression model with expanded CRN ratings and look at the results, but I will leave that judgment to RomanM.

]]>With respect, what is the goal of your analysis? It seems that the same conclusions were reached a long time ago (perhaps with less conclusive methods). What is the question we are trying to answer here?

Clayton, I have been aware of the Watts team’s findings for some time, and of some qualitative statements made about the expectations from the ratings, but I have not seen any comprehensive analysis such as Roman presents here. I agree with Roman that it does not give a complete or final picture, but its form allows for some interesting and important talking points, in my view anyway.

]]>I would like to publicly thank Roman for the effort that he put into this comprehensive analysis and for putting it into a form that can be readily understood from his explanations. After communicating by email with Roman, I think I can say that this old dog has learned a new trick or two. I hope the interested parties here will study the data and calculations so that we can have a conversation about its more interesting aspects, the population and CRN interaction coming to mind.

After seeing Roman’s methods and having him explain them to me, I wholeheartedly agree with his analysis and understand the errors and shortcomings of my more simple-minded approach. I feel more confident of the apparent conclusions now that a statistician with Roman’s credentials has analyzed the data. In the end I would hope that, when the Watts team reaches a completion point with their CRN evaluations, their representatives and Roman could combine to publish the results. I gave Roman the data from the USHCN Calculated Mean data set for the period 1920-2005. Future analyses might want to consider other data sets and time periods, even though I thought I had good a priori reasons for using this data set and time period.

]]>With respect, what is the goal of your analysis? It seems that the same conclusions were reached a long time ago (perhaps with less conclusive methods). What is the question we are trying to answer here?

]]>Ken ran a regression on the station trends using the CRN rating including the variables population, latitude, longitude and elevation (all as numeric) to isolate the effect of CRN on the observed trends. IMHO, treating CRN and population in a linear fashion was hard to justify. Thus, I suggested an analysis of covariance with the variables latitude, longitude and elevation as covariates, but with CRN rating and population as categorical. Ken was kind enough to send me the data he used to do the analysis. For purposes of clarity, I recoded the population variable to A, B, C from the values 1, 2, 3 used by Ken. A is the lowest population level (Rural) and C is the highest (Urban).

The model used was as described above (with interaction between rating and population included). The data set was unbalanced with respect to population. For example, among stations with CRN rating 1, 35% were pop A and 40% were C, while among rating 4 stations, 68% were A and less than 7% were C. Comparing simple averages would be like comparing apples and oranges, so a more sophisticated statistical methodology is needed to adjust for the effect of population on the trends. The results of the analysis:

Analysis of Variance for Trend, using Adjusted SS for Tests

Source            DF    Seq SS    Adj SS   Adj MS      F      P
lat                1   41.7836   34.5851  34.5851  67.09  0.000
long               1   20.0465    4.1518   4.1518   8.05  0.005
Elev(ft)           1    0.4995    3.3163   3.3163   6.43  0.012
CRNRate            4   14.7468   17.8967   4.4742   8.68  0.000
PopRate            2   11.7484    8.6336   4.3168   8.37  0.000
CRNRate*PopRate    8   12.2730   12.2730   1.5341   2.98  0.003
Error            436  224.7633  224.7633   0.5155
Total            453  325.8612

S = 0.717992   R-Sq = 31.02%   R-Sq(adj) = 28.34%

All of the factors were statistically significant in their effect on trend. I ran the analysis without the covariates and the R-sq value dropped dramatically. Since the design was unbalanced, adjusted sums of squares were used for testing the various effects – each factor is tested after all of the other factor effects have been accounted for. The diagnostics for the analysis indicated no obvious problems with assumptions.
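For readers who want to see how the summary figures in the output hang together, the following is a small Python sketch that recomputes the F ratios, S and the R-Sq values from the sums of squares printed above. It uses only the standard textbook relations (F = Adj MS / Error MS, S = square root of Error MS, R-Sq = 1 - Error SS / Total SS), not anything specific to Minitab, and the variable names are just for illustration. It does not recompute the P values.

```python
import math

# Adjusted mean squares copied from the ANOVA output above.
adj_ms = {
    "lat": 34.5851,
    "long": 4.1518,
    "Elev(ft)": 3.3163,
    "CRNRate": 4.4742,
    "PopRate": 4.3168,
    "CRNRate*PopRate": 1.5341,
}
error_ss, error_df = 224.7633, 436
total_ss, total_df = 325.8612, 453

# Error mean square: the denominator of every F ratio in the table.
mse = error_ss / error_df

# Each effect is tested by its adjusted mean square divided by the error mean square.
f_stats = {name: ms / mse for name, ms in adj_ms.items()}

s = math.sqrt(mse)                            # residual standard deviation
r_sq = 1 - error_ss / total_ss                # proportion of variation explained
r_sq_adj = 1 - mse / (total_ss / total_df)    # same, penalized for model size

for name, f in f_stats.items():
    print(f"{name:16s} F = {f:6.2f}")
print(f"S = {s:.6f}  R-Sq = {100 * r_sq:.2f}%  R-Sq(adj) = {100 * r_sq_adj:.2f}%")
```

The recomputed values should agree with the printed output up to rounding of the inputs.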

The adjusted means from Minitab (Trend is in degrees C) and standard errors:

Least Squares Means for Trend

CRNRate            Mean   SE Mean
1                0.3566   0.16387
2                0.5182   0.11984
3                0.4942   0.09038
4                0.7235   0.06998
5                1.1844   0.10039

PopRate
A                0.4080   0.07341
B                0.6576   0.09481
C                0.9005   0.09494

CRNRate*PopRate
1 A              0.3649   0.27302
1 B              0.4159   0.32208
1 C              0.2891   0.25700
2 A              0.2806   0.15719
2 B              0.7202   0.25422
2 C              0.5537   0.19970
3 A              0.3340   0.10186
3 B              0.5060   0.15463
3 C              0.6427   0.19473
4 A              0.5160   0.05653
4 B              0.4242   0.09331
4 C              1.2302   0.18110
5 A              0.5446   0.13989
5 B              1.2217   0.15156
5 C              1.7869   0.21375

The adjusted means here are calculated by setting each covariate (lat, long and Elev) to the average for that variable. Population is accounted for by assuming that 1/3 of the stations in each CRN rating are of each population level, A, B, C. Individual pairwise comparisons of the 5 CRN rating means showed that each of levels 1 to 4 differed significantly from level 5, but the four did not differ significantly among themselves. Graphically, the above adjusted means look like:

http://www.math.unb.ca/~roman/graphs/mainfull.jpg

and

http://www.math.unb.ca/~roman/graphs/interfull.jpg

Since the percentages of A, B and C stations in the sample were actually 59.69%, 26.43% and 13.88% respectively, I also calculated adjusted means using these percentages and compared them to both the previous adjusted means and the unadjusted means of the trends:

http://www.math.unb.ca/~roman/graphs/adjusttrend.jpg
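To make the two weighting schemes concrete, here is a minimal Python sketch that averages the CRNRate*PopRate cell means from the table above, first with equal weights of 1/3 and then with the sample percentages. Treating the reweighted adjusted mean as a weighted average of the cell means is an assumption on my part about how the percentages enter, and the function and variable names are hypothetical. The graph remains the reference for the figures actually used.

```python
# CRNRate*PopRate adjusted cell means, copied from the Least Squares Means table.
cell_means = {
    1: {"A": 0.3649, "B": 0.4159, "C": 0.2891},
    2: {"A": 0.2806, "B": 0.7202, "C": 0.5537},
    3: {"A": 0.3340, "B": 0.5060, "C": 0.6427},
    4: {"A": 0.5160, "B": 0.4242, "C": 1.2302},
    5: {"A": 0.5446, "B": 1.2217, "C": 1.7869},
}


def weighted_crn_means(cells, weights):
    """Average each CRN rating's cell means over population levels with the given weights."""
    return {
        crn: sum(weights[pop] * mean for pop, mean in by_pop.items())
        for crn, by_pop in cells.items()
    }


# Equal weighting: each population level counts for 1/3 within every CRN rating.
equal_weights = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
# Sample weighting: the proportions of A, B and C stations quoted above.
sample_weights = {"A": 0.5969, "B": 0.2643, "C": 0.1388}

equal_means = weighted_crn_means(cell_means, equal_weights)
sample_means = weighted_crn_means(cell_means, sample_weights)

for crn in cell_means:
    print(f"CRN{crn}: equal = {equal_means[crn]:.4f}   sample = {sample_means[crn]:.4f}")
```

With equal weights the sketch reproduces the CRNRate rows of the table to rounding. With sample weights the Rural cells dominate, so the CRN4 and CRN5 means are pulled down toward their A values.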

At Ken’s request, I also ran a similar analysis after combining the station ratings into two categories, CRN123 and CRN45, and the population also into two categories, (AB) and C:

Factor     Type   Levels  Values
CRNGroup   fixed       2  123, 45
PopGroup   fixed       2  AB, C

Analysis of Variance for Trend, using Adjusted SS for Tests

Source              DF   Seq SS   Adj SS  Adj MS      F      P
lat                  1   41.784   36.526  36.526  67.99  0.000
long                 1   20.047    7.730   7.730  14.39  0.000
Elev(ft)             1    0.500    1.972   1.972   3.67  0.056
CRNGroup             1    4.179   14.185  14.185  26.40  0.000
PopGroup             1   12.357   13.199  13.199  24.57  0.000
CRNGroup*PopGroup    1    6.848    6.848   6.848  12.75  0.000
Error              447  240.148  240.148   0.537
Total              453  325.861

S = 0.732969   R-Sq = 26.30%   R-Sq(adj) = 25.31%

Least Squares Means for Trend

CRNGroup             Mean   SE Mean
123                0.4645   0.07165
45                 0.9997   0.07461

PopGroup
AB                 0.4751   0.04098
C                  0.9891   0.09472

CRNGroup*PopGroup
123 AB             0.3927   0.06902
123 C              0.5363   0.12554
45 AB              0.5575   0.04405
45 C               1.4419   0.14300

Elevation plays a smaller role here. It is also pretty obvious that, as a group, the urban stations with ratings 4 and 5 differ substantially from the others. Plots are not included since there are only two categories for each factor and the numbers above are simple enough to digest.

If the tables post in an unreadable fashion, you can download a pdf document containing some of the computer output (and the graphs) here:

]]>But for Hansen’s method to work as advertised, he has to have a reference core of CRN1-2 stations and a system of identifying them – a point that John V frustratingly ignored.

I don’t understand. Please clarify.

]]>