Statistical Tests for the Normality of the Distribution of Annual Changes in Global Average Temperatures

applet-magic.com Thayer Watkins Silicon Valley & Tornado Alley USA


The accepted record of average global temperatures is given below:

The temperature plotted on the vertical axis is the deviation of each year's average temperature, in degrees Celsius, from the average temperature for the period. This is called the temperature anomaly. The average temperature for the period is 14°C. (The use of average annual temperatures tends to muck up the statistical analysis, but it is necessary to work with such averages at least for the preliminary analysis.)
A perplexing aspect of the global temperature data is that there is no measure of accuracy associated with each datum. Surely the earlier years, with their fewer weather stations and less accurate instruments, have less accurate values than the later years. However, a systematic but constant bias in the measurements is not really an issue. The concern is not with the level of the temperature but with the change in the level of the temperature. Systematic bias, as long as it does not change, will not affect the changes in temperature. Thus the improper placement of a measuring station results in a bias, but as long as the bias does not change it is unimportant. Any changes in the number or locations of measuring stations, however, could create the appearance of a spurious trend. Thus the shutting down of hundreds of high-latitude weather stations by Russia in the 1990s for budgetary reasons would be cause for concern about any appearance of trends in the temperature data.
There do appear to be trends. From 1855 to about 1870 there is an upward trend, then from 1870 to 1910 a downward trend. Without any obvious explanation, from 1910 there is an upward trend that continues until about 1945. After 1945 the trend is downward until about 1975. Since 1975 the trend has been upward. As the climatologist Patrick J. Michaels has pointed out, the slopes of the trends from 1910 to 1945 and from 1975 onward are about the same. Moreover, the slopes of the downward trends from 1870 to 1910 and from 1945 to 1975 are also about the same. The initial upward trend from 1855 to 1870 could be perceived as having about the same slope as the two later upward trends.
Since variables which are the cumulative sums of random disturbances appear to have trends even when the random disturbances have an expected value of zero, it is unwise and unsound to extrapolate any apparent trends for such variables. The temperature of the Earth's surface is thermodynamically the cumulative sum of the net heat inflow to it. The question is whether or not the net heat inflow is a random (stochastic) variable. This can be judged by looking at the changes in temperature from year to year. These changes are shown below.
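The claim that cumulative sums of zero-mean disturbances can look as if they follow trends can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the disturbances are taken to be normal with a hypothetical standard deviation of 0.1, and the names random_walk and fitted_slope are illustrative, not anything from the analysis here.

```python
import random

def random_walk(n_steps, sigma=0.1, seed=None):
    """Cumulative sum of zero-mean normal disturbances (sigma is an assumed value)."""
    rng = random.Random(seed)
    level = 0.0
    path = []
    for _ in range(n_steps):
        level += rng.gauss(0.0, sigma)  # each disturbance has expected value zero
        path.append(level)
    return path

def fitted_slope(path):
    """Least-squares slope of the path against the step index 0, 1, 2, ..."""
    n = len(path)
    mean_t = (n - 1) / 2
    mean_y = sum(path) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(path))
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    return sxy / sxx

if __name__ == "__main__":
    # Several independent walks: the fitted slopes differ from run to run
    # even though the true drift is zero in every case.
    for seed in range(5):
        walk = random_walk(150, seed=seed)
        print(seed, fitted_slope(walk))
```

Running it for a handful of seeds gives fitted slopes of varying sign and size, none of which reflects any real drift in the generating process.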

The data viewed in this form do not show any obvious trends. A regression line for a trend in the changes is barely perceptible because it is so close to the horizontal axis. The t-ratio (regression coefficient divided by its standard error) for the regression slope is a minuscule 0.01, which is not significantly different from zero at the 95 percent level of confidence.
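For readers who want to see how such a t-ratio is obtained, here is a minimal sketch of the ordinary least-squares calculation. The function name slope_t_ratio and the five data points are hypothetical and are used only to show the arithmetic; they are not the temperature changes.

```python
import math

def slope_t_ratio(xs, ys):
    """Return (slope, t-ratio) for a simple least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    ssr = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_error = math.sqrt(ssr / (n - 2) / sxx)
    return slope, slope / std_error

if __name__ == "__main__":
    # Hypothetical year-to-year changes, not the actual record.
    years = [0, 1, 2, 3, 4]
    changes = [0.1, -0.1, 0.0, 0.1, -0.1]
    print(slope_t_ratio(years, changes))
```

A t-ratio well below about 2 in absolute value, as in this made-up example, is the usual sign that the slope cannot be distinguished from zero at the 95 percent level.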
Another way of examining the temperature change data is to construct a frequency distribution (histogram). Here the temperature changes are grouped into temperature change intervals of 0.05°C width.

The average temperature change is 0.0055°C per year and that is equivalent to 0.55°C per century. The t-ratio for that change is 0.53 and not significantly different from zero at the 95 percent level of confidence. It is notable that the distribution looks more or less like a normal distribution. This is as would be expected from the Central Limit Theorem which says that some quantity which is the sum of a large number of independent random influences will have a frequency distribution which is closer to a normal distribution the larger the number of independent influences. This lends credence to the notion that the year-to-year temperature changes are stochastic (random).
The normal distribution is a stable distribution, meaning that if independent variables x and y have normal distributions then x+y will have a normal distribution. The normal distribution is not the only stable distribution. There is a whole family of stable distributions characterized by a set of four parameters. The stable distributions are identified in terms of the parameters of their characteristic functions.
The above frequency distribution is converted into an estimate of the probability distribution by dividing the frequency by the total number of observations. The values for the intervals are assumed concentrated at the midpoints of the intervals. The value of the characteristic function for a frequency ω is computed by the following procedure. The justification of this procedure is given in Characteristic Function.

First, sum the products of the probability p(x) with the cosine of ωx and with the sine of ωx. These sums are the real and imaginary components of the characteristic function at ω. Thus for ω=1 the characteristic function is X+iY=0.992426408+i0.005136492.
Second, take the logarithm of the characteristic function, x+iy=ln(X+iY), where y=tan^-1(Y/X) and x=ln(X)-ln(cos(y)). Thus for ω=1, y=0.005175645 and x=-0.007596688, so that -x=0.007596688 and ln(-x)=-4.880042941.
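The two steps of the procedure can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the histogram in the usage section is made up, and the formula for x is the one stated in the text, which requires the real part X to be positive.

```python
import math

def characteristic_function(midpoints, probs, omega):
    """Sum p(x)*cos(omega*x) and p(x)*sin(omega*x) over the interval midpoints."""
    real = sum(p * math.cos(omega * x) for x, p in zip(midpoints, probs))
    imag = sum(p * math.sin(omega * x) for x, p in zip(midpoints, probs))
    return real, imag

def log_characteristic_function(real, imag):
    """Return (x, y) with x + iy = ln(X + iY), using the formulas in the text.

    Valid when X > 0, so that tan^-1(Y/X) is the argument of X + iY.
    """
    y = math.atan(imag / real)
    x = math.log(real) - math.log(math.cos(y))
    return x, y

if __name__ == "__main__":
    # Hypothetical histogram: midpoints of 0.05-wide intervals and their
    # relative frequencies. These are NOT the actual temperature data.
    midpoints = [-0.10, -0.05, 0.0, 0.05, 0.10]
    probs = [0.1, 0.2, 0.4, 0.2, 0.1]
    X, Y = characteristic_function(midpoints, probs, 1.0)
    print(log_characteristic_function(X, Y))
```

Since ln(X) - ln(cos(y)) equals the logarithm of the modulus of X+iY, the value of x can be cross-checked against the real part of cmath.log(complex(X, Y)).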

Compiling the values of ln(-x(ω)) for several values of ω gives

ω      ln(ω)         x(ω)           ln(-x(ω))
1.0    0.0           -0.007596688   -4.880042941
2.0    0.693147181   -0.03041915    -3.492682924
4.0    1.386294361   0.022911947    -2.102105229
8.0    2.079441542   0.065389222    -0.698199816
16.0   2.772588722   0.601453768    0.793812289

The plot of ln(-x(ω)) versus ln(ω) from the above data is shown below.

For a stable distribution the data points should lie on a straight line. To a very close approximation they do so. The value of α is given by the slope of this line. Its value is 2.046410701, which is very close to the value of α for a normal distribution, which is 2.0. This value is, however, beyond the allowable range for stable distributions, which only goes up to 2.0. It is only a little beyond the allowable range.
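The slope can be checked directly from the tabulated values of ln(ω) and ln(-x(ω)). In the sketch below the function names are illustrative. The quoted value of α is reproduced by the slope between the first and last tabulated points; a least-squares fit through all five points, shown for comparison, gives a slightly smaller slope of about 2.040.

```python
# Values of ln(omega) and ln(-x(omega)) copied from the table in the text.
LN_OMEGA = [0.0, 0.693147181, 1.386294361, 2.079441542, 2.772588722]
LN_NEG_X = [-4.880042941, -3.492682924, -2.102105229, -0.698199816, 0.793812289]

def endpoint_slope(xs, ys):
    """Slope of the straight line through the first and last points."""
    return (ys[-1] - ys[0]) / (xs[-1] - xs[0])

def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through all points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

if __name__ == "__main__":
    print("endpoint slope:     ", endpoint_slope(LN_OMEGA, LN_NEG_X))
    print("least-squares slope:", least_squares_slope(LN_OMEGA, LN_NEG_X))
```

Either way the estimate falls slightly above 2.0.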
The value of ν is found from

ln(ν)=ln(-x(1))/α=-4.880042941/2.046410701=-2.384684042
and thus
ν=0.09211808.
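The arithmetic for ν is easy to verify. The sketch below assumes, as the formula above implies, that -x(ω) = (νω)^α, so that ln(-x(1)) = α ln(ν). The function name dispersion_index is illustrative.

```python
import math

ALPHA = 2.046410701            # slope quoted in the text
LN_NEG_X_AT_1 = -4.880042941   # ln(-x(1)) from the table

def dispersion_index(ln_neg_x_at_1, alpha):
    """Solve ln(-x(1)) = alpha * ln(nu) for nu."""
    return math.exp(ln_neg_x_at_1 / alpha)

if __name__ == "__main__":
    print(dispersion_index(LN_NEG_X_AT_1, ALPHA))
```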

For a normal distribution ν is equal to the standard deviation. For other stable distributions the standard deviation is infinite. The parameter ν is a dispersion index for the distribution. The standard deviation computed from the sample of global temperature changes is 0.122, but if the sample is from a distribution which does not have a finite standard deviation then the sample standard deviation will not have an expected value. It will be essentially a random number.
Using the values for ω=1 and ω=16, a pair of equations to be satisfied by δ and β is

δ(1) + β(0.09211808)(1.0)^2.046410701(0.073028453) = 0.005175645
δ(16) + β(0.09211808)(16.0)^2.046410701(0.073028453) = 0.601453768
which reduce to
δ + β(0.006727241) = 0.005175645
δ + β(1.958668781) = 0.601453768

Subtracting the first equation from the second and dividing by the coefficient of β gives β=0.305479499. Then

δ = 0.601453768 - (0.305479499)(1.958668781) = 0.003120611.
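The coefficients and the solution can be reproduced numerically. The sketch below takes the reduced pair of equations exactly as printed, with a coefficient of 1 on δ in both, and solves it by Cramer's rule. The function names beta_coefficient and solve_pair are illustrative.

```python
NU = 0.09211808
ALPHA = 2.046410701
FACTOR = 0.073028453  # constant multiplying beta in the equations above

def beta_coefficient(omega):
    """Coefficient of beta in the equation for a given omega."""
    return NU * omega ** ALPHA * FACTOR

def solve_pair(a1, b1, c1, a2, b2, c2):
    """Solve a1*delta + b1*beta = c1 and a2*delta + b2*beta = c2."""
    det = a1 * b2 - a2 * b1
    delta = (c1 * b2 - c2 * b1) / det
    beta = (a1 * c2 - a2 * c1) / det
    return delta, beta

if __name__ == "__main__":
    print(beta_coefficient(1.0), beta_coefficient(16.0))
    # Reduced pair as printed in the text.
    delta, beta = solve_pair(1.0, 0.006727241, 0.005175645,
                             1.0, 1.958668781, 0.601453768)
    print("delta =", delta, "beta =", beta)
```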

The parameter β indicates the skewness of the distribution. For a normal distribution the skewness is zero. The value of β=0.305479499 indicates that there is skewness to the right. Thus the distribution is not normal and does not have a finite standard deviation.
The value of δ gives the expected value for the distribution. The value of δ=0.003120611 is the average temperature increase per year. This is 0.31°C per century.

The following histogram is based upon a sample of 2000 observations of a random variable which has a stable distribution characterized by the values of the parameters shown. Each time the image is refreshed a new sample of 2000 observations is drawn.
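One way such a sample might be drawn is the Chambers-Mallows-Stuck method. The sketch below is not the code behind the display. It covers only the symmetric case (β=0), and the values α=1.9 and scale 0.09 in the usage section are hypothetical stand-ins for the parameters shown with the histogram. With this parameterization the characteristic function is exp(-|scale·ω|^α), so α=2 gives a normal distribution with variance 2·scale².

```python
import math
import random

def symmetric_stable_sample(alpha, scale, n, seed=None):
    """Draw n values from a symmetric stable law (beta = 0) with 0 < alpha <= 2."""
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
        w = rng.expovariate(1.0)                    # unit exponential
        while w == 0.0:                             # guard against division by zero
            w = rng.expovariate(1.0)
        if alpha == 1.0:
            z = math.tan(u)                         # Cauchy special case
        else:
            z = (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
                 * (math.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha))
        sample.append(scale * z)
    return sample

if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    draws = symmetric_stable_sample(alpha=1.9, scale=0.09, n=2000, seed=0)
    print(min(draws), max(draws))
```

For α below 2 an occasional draw lies far from the rest of the sample, which is the heavy-tail behavior that distinguishes these distributions from the normal.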

Below is a display that shows what temperature would look like when the changes in temperature from one period to the next are random values with a stable distribution similar to that found for the actual global temperature changes. A sample of 200 random increments is chosen and the cumulative sum computed. The values are scaled so as to fit the maximum value within the display area. Each time the screen is refreshed a new sample of 200 random values is drawn.

Note that the series almost always appears to be following a trend. However, since the value of δ is 0, there is in reality no long-term trend.
Conclusion

Although the distribution of annual changes in global average temperatures is close to a normal distribution, it differs from a normal distribution in two essential ways. First, it is skewed to the right. Second, because of this skewness it does not have a finite standard deviation, and thus any sample estimate of the standard deviation of annual changes is meaningless. The expected increase in average global temperature from the analysis is 0.31°C per century. This is of the same order of magnitude as other empirical estimates of the trend in average global temperature.
