Wednesday, February 25, 2009

S09, MM07 and Spatial Autocorrelation

S09 suggests that spatial autocorrelation caused spurious results in MM07. In addition S09 points out that using the RSS data the significance of the MM07 results decreases. S09 also demonstrates that spurious correlations can be shown by regressing economic statistics against model driven temperature data. My conclusion is that using standard techniques of spatial analysis these findings as they relate to MM07 appear to be incorrect. There may in fact be other issues with MM07 but I am unconvinced that the arguments in S09 disprove MM07.

As a caveat I have only recently taken a look at spatial analysis. In addition I am hardly an expert in statistics. But the methods for doing this seem straight forward, and it is well implemented in the R language, using the spdep package. There are a number of options to choose on the various functions. I have played around with these without the results changing. I have not obtained any unreported results counter to my conclusions, but readers are free to try other options of course. Once again I have published all my code, and the locations of the data. I am wide open to suggestions and criticism.

It is well understood that the test for spatial autocorrelation in a regression is whether the residuals are spatially correlated. The most common test for this seems to be Moran’s I test. Moran’s I test yields a value between –1 and 1, which indicates the amount and sign of correlation. It also yields a p-value, which indicates whether the correlation is significant.
Running Moran’s I on the main UAH regression the statistic is .01 with a p-value of .24. This shows an insignificant and very small positive spatial correlation. I actually ran the test using three distance weightings schemes 1/x, 1/x^2, and 1/sqrt(x). All showed similar results. I think 1/x is the most standard and I will report the rest of the results in this post using that weight scheme.

The conclusion unsurprisingly is the same as Dr. McKitrick’s follow up paper (unpublished, but available on his website) dealing with spatial autocorrelation of his results.

As a point of interest I ran the Moran’s I test on the results of the regression using the RSS data as discussed in S09. A bit surprisingly this shows more signs of spatial autocorrelation at .029 and significance just below the 95% confidence level at p-value= .053. Later I will show the results of running both the UAH and RSS regressions with regression estimators that take into account spatial autocorrelation.

Turning to the model data I duplicated the results in S09 by running a regression using the modeled tropospheric and surface data. Exactly as in S09 various economic variables showed significance in the regression, although the coefficients are very small.

Running the Moran’s I test on this result showed that the residuals are significantly correlated with location. The statistic is .06 with a p-value < .01. Thus as hypothesized in S09 the spurious results of this regression appear to be caused by spatial autocorrelation. However as I have shown the fact that this regression gets spurious results in no way means that the MM07 results are spurious as shown by the Moran’s I test.

Another interesting result, which is only discussed briefly in MM07 and was not discussed in S09 is that a regression of the surface data without tropospheric data also shows significance for the economic variables. This result would tend to minimize the concern about the choice of satellite data introduced in S09. I duplicated this regression result, but the Moran’s I test shows very significant autocorrelation in the residuals. The statistic is .17 with a p-value < .01. So for the moment at least the regression result isn’t meaningful.

The lagsarlm function in spdep is a regression estimator that takes spatial autocorrelation into account. It yields significance factors for the variables, as well as p-value showing the significance of the overall result. As a start I ran an estimate on the model described above which doesn’t use the tropospheric data. The resulting estimate shows that several of the economic data are significant (all but x). In addition the overall p-value is < .01. Using the Model E data I get an estimate where the coefficients are extremely small but they are significant for several (e,x,p,m). However the p-value of the result is .38 and so the overall result isn’t significant. This then shows that the test is working since we wouldn’t expect the result to be significant as the economic data could have no effect on the model.

Running the UAH model through lagsarlm I get a similar result to the original regression. In addition the p-value is .02 showing that the result is significant. So the UAH model in MM07 holds up under both tests.

Finally I ran the RSS model from S09 . Taking spatial autocorrelation into account raises the significance of the economic variables. Now g, e, p, m, y, and c are all significant. In addition the p-value for the overall regression is .018. Absent any other information then I would conclude that MM07 is demonstrating that surface temperature trends are affected by a set of economic variables.

This is hardly surprising since there are all sorts of ways that this could happen. Urban heat island is only one reason along with various large scale surface changes. I don’t know the significance of this but the Model E data shows the troposphere trend at .16 degrees per decade with the surface at .14 degrees per decade. This is in contrast to the measured data with surface at .30 and the troposphere (both UAH and RSS) at .24. In the model the surface is heating ten percent less than the troposphere, but as measured in the real world the surface is heating 25% faster. Of course all kinds of things could cause this discrepancy including issues with the model data, and with the measurements of either the troposphere or the surface. But it is ironic that this roughly 30% swing is in line with MM07s estimate of the bias in the global surface temperature trend.

Test results can be found here. An ever growing code file can be found here.

Tuesday, February 17, 2009

Update to MM07 and S07 Analysis

[It is 2/19/2009 and the conclusions of this post have changed. I'm not sure what I did before but I now get a somewhat different result than when I first put this up. ]As I was writing my previous post about MM07 and S09 Dr. Schmidt was responding to me at realclimate. Dr. Schmidt says that he picked Top/Right of the four choices which would be the same as MM07. I'm sure he is right, but the results look more like one of the two bottom choices from my analysis.

Anyway the point isn't that important because Dr. Schmidt agreed with me that the best way was to average the four boxes to match the 5x5 degree surface grid. He recalculated the result and put it in his supplementary information. He reported no significant change in results. I actually get a somewhat different result with the regression using RSS data yielding significant correlations with the economic factors.

I have rerun the UAH and RSS analysis using the 5x5 grid, and the results are below. The UAH result matches MM07, but for me the RSS result doesn't completely match S09. In the case of the RSS case some of the significance to economic factors drop, but not as much as stated in S09. I don't know what is causing the difference, as the process for converting the RSS set to temperature trends isn't documented in S09.

[update later on 2/19] As I was looking back at the S09 SI I realized that S09 used an updated surface temperature set in addition to using the RSS data. The RSS data alone doesn't really change the results from MM07 but the two together just push the correlations for some of the economic data above the .05 significance level. I show the results both ways in an attached PDF. The result with the updated surface data and the RSS data looks like S09.

I note that the updated surface data still gets essentially the same results with the 5x5 UAH data.

I have to say that until I understand why the RSS data has such a different standard deviation than both the UAH trop data and the surface data I'm not sure that it is valid to conclude there is an issue with MM07 from the fact that the model doesn't work well with the RSS data. As I mentioned in my previous post the model works well without any tropospheric data at all.

Meanwhile the issue of spatial correlation keeps coming up, so I guess I will look into that as well.

Updated code is here.

Here are the results using RSS and UAH and the 5x5 grid.

A look at MM07 and S09

The publication “Spurious correlations between recent warming and indices of
local economic activity” (S09) renewed my interest in one of the papers discussed “Quantifying the influence of anthropogenic surface processes and
inhomogeneities on gridded global climate data” (MM07). The topic of the accuracy of the surface record seems to come up quite a bit. MM07 demonstrated correlations between various measures of economic activity in a geographic area and the level of measured surface warming. S09 showed that the results of MM07 depended on the choice of satellite measurement. It also demonstrated that some correlations could be found between economic data and climate model results. This would need to be spurious since the climate models are independent of economic factors.

What is interesting about the result of my work so far is the influence on results of algorithm choices. Neither paper explained the details of a key step in their data processing, although through an email exchange Dr. McKitrick pointed me to some old software from his 2005 paper that shows the choice that he made. I think I will show that the choice in this step is not obvious, but it could possibly change at least part of the conclusions of both papers.

In MM07 they looked at the relationship between the trends in satellite measured tropospheric temperatures versus surface station measured temperatures at various locations. The idea was that the satellite record would largely be uninfluenced by economic factors and would serve as a baseline for the surface temperature trends. I note that in the paper they also discuss the fact that the satellite record appears to be somewhat influenced by local economic factors. I noticed this as well in a simplistic look, which I will describe later.

Reproducing the core MM07 data required reading through a STATA script and converting the steps to R as there are details that weren’t present in their paper. This is a great example of why having the code available is useful even if it is in a different language. I would have found it difficult to scale the economic data correctly, and to select the correct rows without this information. Maybe it would be obvious to others.

Once I got passed a little learning curve with R (this is my first time), I was able to quickly reproduce the regression results of both papers using the satellite data. I haven’t yet looked at the results reported in S09 where the output of a climate model was used.

The final data set is 440 rows of latitude and longitude locations with the decadal surface trend from HadCrut, the tropospheric trend from UAH, and a set of climate and economic factors. I will call this the “global” table.

In MM07 they only used the UAH data stating that they didn’t believe that the choice of satellite data sets would matter. Given how easy it would be to do, I think they should have also tested the RSS data. Dr. Schmidt obviously agreed that it should have been tested and created a parallel trend table using RSS. Using the trend calculated and supplied by Dr. Schmidt many of the correlations in MM07 disappear.

I note that in his blog post he says that he also tested updated UAH data as well as different surface temperature sets. The results of those tests are not reported in his paper, but I assume that they were similar to the results in MM07.

The fact that substituting RSS data for UAH data caused a change in the outcome was surprising to me. After many adjustments over the years by each it was my impression that they were very similar. I assumed that this was particularly true on decadal time scales.

In S09 Dr. Schmidt speculated that the difference in results was due to a higher trend in the RSS data. Looking at the supplied data for the 440 points in the global table I noted that the mean trend was identical between RSS and UAH at (.237K/Dec vs .232K/Dec.), so I’m not sure why he wrote that.

What was different between the RSS data and UAH data was the standard deviation. The RSS data had a standard deviation of .13 versus .19 for UAH. This is a pretty big difference relative to the observed values when both are supposed to measuring the same thing.

I had never seen anything written about this type of difference in the two data sets. As a result I wanted to look at the two data sets more closely to try to understand the difference.

As a step to doing this I decided to try to recreate the 440 rows used in both MM07 and S09 using the monthly anomaly data provided by UAH and RSS. It was then that I realized that I had a problem. Looking at the lat and long data in the rows of the global table I realized that they didn’t match the grids in either the UAH or the RSS products.

UAH and RSS produce data on a 2.5x2.5 degree grid of discrete points. For example in the UAH data set that I used latitude begins at -88.75 degrees North and in 2.5 degree increments continues to 88.75. Longitude begins at -178.75 degrees East and continues in 2.5 degree increments to 178.75.

Just as an example, in the first row of the global table the latitude is 52.5, and the longitude is 2.5. That point doesn’t appear on the satellite grid. In fact neither the latitude nor the longitude appears. So the question is which satellite grid cell do I pick in order to calculate the trend? There are in fact four grid cells which surround that data point. I can pick any of them. (Top Left , Top Right, Bottom Left, or Bottom Right)

It also occurred to me that you could average the trend across all four cells, and that might be the “correct” answer since it is my understanding that the surface data is on a 5x5 grid. I haven’t tried this yet, but I intend to do that to see if anything interesting happens. Dr. Schmidt reports that he has done this and that the results are essentially the same as in S09. I want to try it just for fun anyway. (Update 2/17 I have done this and the results are here.)

It seemed to me that the choice between the four was potentially important even given the fact that there is a lot of spatial correlation, and it fact it turned out to affect the results. There is no indication in either paper or supplied supplementary information of which choice was made. Looking at the code pointed to by Dr. McKitrick in his email it looks like he picked “Top/Right” back in 2005 when he created the data set. Dr. Schmidt didn’t respond to a question asking which choice was made in S09. (Update on 2/17 Dr. Schmidt responded and said he had picked Top/Right as well)

Looking at the UAH data (Top/Left) and (Top/Right) give results that look a lot like MM07. They are not identical but I am using updated UAH data, and it is possible that the trend calculations in R are not the same as in STATA. If I use (Bottom/Left) or (Bottom/Right) then the results aren’t as good, but there are still correlations to the economic variables.

In the case of the RSS data (Top/Left) and (Top/Right) give results that are fairly similar to MM07. (Bottom/Left ) and particularly (Bottom/Right) yield results that are more like S09.

So my conclusion so far is that the choice seems to matter to the result, and I don’t know the right way to make the choice. There are other potential choices in computing the trend data from the satellite grid, but I think this is the biggest one. I would very much have liked to see in the supplemental information how this was done in each case.

I want to note that if you simply regress the surface temperature anomaly against the economic factors you still get significant results. Other geographic factors now become important as well, which makes sense because they aren’t being canceled out by the tropospheric data. I can’t think of any reason that there should be higher surface anomalies in areas of higher economic activity. This is another indication of issues in the surface temperature record. This wasn’t discussed in either MM07 or S09. Dr. McKitrick reports that this was discussed in a 2004 paper.

Regressing either satellite trend against the other factors also results in significant correlations although to a lesser degree (if you will pardon the pun). This isn’t so surprising since the lower tropospheric measurements of the satellites might be influenced by a broad range of man made surface changes. This was discussed in MM07 but not commented on in S09.

I want to thank both Dr. McKitrick and Dr. Schmidt for being so responsive to an amateur.

R scripts to see how I arrived at these conclusions can be found here.

A follow up using 5x5 grids is here.

The following is an update after I wrote the original post.

The following is the result I got using Top/Right and the UAH data. This is similar but not identical to MM07

Mean is .2339
Min 1Q Median 3Q Max
-0.85006 -0.11274 -0.00614 0.11036 0.62474

Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.3795953 3.4086411 -1.578 0.11526
myuah_trop 0.9283931 0.0664378 13.974 <>
slp 0.0055687 0.0033623 1.656 0.09841 .
dryTRUE 2.6062329 4.1729660 0.625 0.53260
dslp -0.0024560 0.0041043 -0.598 0.54990
Water -0.0242371 0.0200943 -1.206 0.22842
abslat 0.0001628 0.0009495 0.171 0.86396
g 0.0392948 0.0171258 2.294 0.02225 *
e -0.0027524 0.0004521 -6.088 2.55e-09 ***
x 0.0041382 0.0035775 1.157 0.24803
p 0.3841318 0.1165618 3.296 0.00106 **
m 0.3947718 0.1431373 2.758 0.00607 **
y -0.2987748 0.1114165 -2.682 0.00761 **
c 0.0058406 0.0025162 2.321 0.02075 *
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1764 on 426 degrees of freedom
Multiple R-squared: 0.5441, Adjusted R-squared: 0.5302
F-statistic: 39.11 on 13 and 426 DF, p-value: <>

On Dr. Schmidt has told me that he selected Top/Right when he wrote his paper. The following are the results that I get when I select Top/Right. As you can see I still show significance at the 95% level for several of the economic variables including population. The significance is less than what I saw using the UAH data as posted above.

Mean trend is .2344
Min 1Q Median 3Q Max
-0.992645 -0.115114 -0.008233 0.111465 0.574084

Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.9067505 3.4937453 -1.404 0.16092
myrss_trop 0.9485121 0.0738105 12.851 <>
slp 0.0049539 0.0034460 1.438 0.15129
dryTRUE 4.0677276 4.3150450 0.943 0.34638
dslp -0.0039007 0.0042443 -0.919 0.35859
Water -0.0139312 0.0205568 -0.678 0.49834
abslat 0.0031509 0.0008996 3.503 0.00051 ***
g 0.0479709 0.0174798 2.744 0.00632 **
e -0.0023729 0.0004656 -5.097 5.21e-07 ***
x 0.0058653 0.0037066 1.582 0.11430
p 0.2687887 0.1208100 2.225 0.02661 *
m 0.3385325 0.1474768 2.295 0.02219 *
y -0.2460588 0.1148984 -2.142 0.03280 *
c 0.0066888 0.0025751 2.597 0.00972 **
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1809 on 426 degrees of freedom
Multiple R-squared: 0.5209, Adjusted R-squared: 0.5062
F-statistic: 35.62 on 13 and 426 DF, p-value: <>

I don't know why I am getting a different result than S09. It would be nice to see the code from that paper that processes the RSS data. Maybe I'm making a mistake of some kind. The trend mean looks pretty similar however.