I had quite an interesting time replicating the calculations from "A Semi-Empirical Approach to Projecting Future Sea-Level Rise" (Science 1/19/2007 R07) by Stefan Rahmstorf. The general concept in the paper is very simple. He estimates a linear equation for the rate of sea level rise based on global temperature from measured data. The difficulty came in because the paper relies on an algorithm that is not generally available. To demonstrate this it seems that although there were two critical comments on his paper published in Science neither replicated the original calculations of R07. It only became possible to replicate the results when Dr. Rahmstorf published his code as part of his response to those comments. Even then you had to do a little hunting. In this post I will show that I have duplicated those results, so that my subsequent comments make sense.
R07 is quite simple in concept. His theory is that the rate of sea level rise has a linear correlation with surface temperature. The idea is that you start with a situation where sea level is stable. Then you raise the temperature causing sea level to rise until is reaches a new equilibrium. I won't explain the whole paper, but this makes perfect sense. There are, of course, multiple inputs to sea level rise, but pretty much all of them should respond to a change in surface temperature, the question is how fast and how much.
To estimate this linear equation he uses sea level data from Church and White (IPCC), as well as the GISS global temperature record. But this is where things get a little tricky. He doesn't do a simple regression on the temperature and sea level because this data is "noisy." Instead runs both the temperature and the sea level data through a process to separate the trend from the noise. This isn't explained in the main text of the paper, but it is referred to in the text below figures 2 and 3. "Both temperature andsea-level curves were smoothed by computing nonlinear trend lines, with an embedding period of 15 years (14)." He uses the word "smoothed" but this is not a typical smoothing algorithm. Use of a typical polynomial fit will not get as "good" a result, as I discovered in early attempts.
I might have discovered this more quickly if I had read all the way to the bottom of Dr. Rahmstorf's reply to comments where a link to the code and data was published. The code and data weren't linked from the original paper which is where I started. Both published comments also didn't have the benefit of this code and data which is clear from reading them, and I guess after that everyone called it a day. But I will get to this stuff in a later post.
Reference 14 from the paper is "J. C. Moore, A. Grinsted, S. Jevrejeva, Eos 86, 226 (2005)." This paper is titled "New Tools for Analyzing Time Series Relationships and Trends." It is actually a short review article of several new mathematical techniques. At first glance it certainly isn't immediately apparent that this tells us how the "smoothing" was done. But there is a section titled "Nonlinear Trends in Sea Level and Temperature." This section refers to the use of Single Spectrum Analysis (SSA) to extract a nonlinear trend. It refers to Ghil et. al. 2002. So this is at best an indirect reference.
At this point I still hadn't discovered the code at the bottom of the reply, but I did have a lead. So I looked into SSA and discovered software at UCLA that was available. To make a long story short SSA is necessary but not sufficient to duplicate the R07 results. In fact further searches, and eventually the code led me to the fact that R07 relied on a matlab function called ssatrend.m. Dr. Rahmstorf on Real Climate indicated that this was available from Aslak Grinsted.
I wrote Dr. Grinsted who wrote me back very promptly, and sent me the source code to ssatrend.m. He also commented that he had no idea how Dr. Rahmstorf had gotten a copy of it, and that he had never meant for it to be distributed. I think that he was just concerned about it being unsupported. My own view is that it is pretty strange to use some random piece of code in a published paper without making the code your own. Especially where, as in this case, the strength of ther result depends on using this specific algorithm. It isn't available from any of the usual Matlab repositories which is the first place I looked.
In any event with a little effort I found an SSA algorithm written in R on source forge. I checked its results agains the UCLA program and they were identical so I knew the foundation was good. Then with some effort, I translated ssatrend.m into R so that I could run it on an open platform.
Finally I translated the code supplied in the reply written by Dr. Rahmstorf into R so that I could at see if I was starting from the same place. This was successful.
Using the supplied input files for sea level and temperature I get a correlation coefficient (R) of .88 exactly as reported in R07. In addition the following graphs are identical to figures two and three (top) of R07. The code is here.
At the end of the day the use of this unpublished algorithm made this much harder than it should have been as the underlying concept is very simple. I have no idea how a reviewer could have evaluated whether the use of this algorithm made sense. Having said that I don't really see any problem with it, and I look forward to trying ssatrend with other data.
But now that I can get the exact results from the paper, I have a couple of comments that go beyond what has already been written in Science.