Thayer Watkins
Silicon Valley
& Tornado Alley

The Characteristic Function
of a Probability Distribution

Let z be a stochastic variable and p(z) be the probability density function for z; i.e., the probability of obtaining a value of z between a and b is:

$$P(a \le z \le b) = \int_{a}^{b} p(z)\,dz.$$

The expected value of any function of z, say g(z), is defined as

$$E\{g\} = \int_{-\infty}^{\infty} g(z)\,p(z)\,dz.$$

The expected value of the function exp(iωz) is called the characteristic function for the probability distribution p(z), where ω is a parameter that can have any real value and i is the square root of -1. That is to say, the characteristic function of p(z) is

$$\Phi(\omega) = E\{\exp(i\omega z)\} = \int_{-\infty}^{\infty} \exp(i\omega z)\,p(z)\,dz.$$

Note that $\Phi(0) = \int_{-\infty}^{\infty} p(z)\,dz = 1$.

The characteristic function will generally be a complex function; i.e., Φ(ω) = Χ(ω) + iΥ(ω). Since exp(iωz) = cos(ωz) + isin(ωz) the components of the characteristic function are given by:

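$$\mathrm{X}(\omega) = E\{\cos(\omega z)\} = \int_{-\infty}^{\infty} \cos(\omega z)\,p(z)\,dz$$
$$\Upsilon(\omega) = E\{\sin(\omega z)\} = \int_{-\infty}^{\infty} \sin(\omega z)\,p(z)\,dz$$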

Thus, given a probability distribution p(z), it is a straightforward computation to calculate the real and imaginary components of its characteristic function.
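The computation described above can be sketched numerically. The following is a minimal illustration, not part of the original text: it assumes a standard normal density as the test case and approximates the two integrals by the midpoint rule on a truncated interval. The function names, the interval [-12, 12] and the step count are all choices made for this sketch.

```python
import math

def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution (used here only as a test case)."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))

def characteristic_function(pdf, omega, lo=-12.0, hi=12.0, steps=4000):
    """Approximate E{exp(i*omega*z)} by the midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    real = 0.0  # real component, E{cos(omega*z)}
    imag = 0.0  # imaginary component, E{sin(omega*z)}
    for k in range(steps):
        z = lo + (k + 0.5) * h
        weight = pdf(z) * h
        real += math.cos(omega * z) * weight
        imag += math.sin(omega * z) * weight
    return complex(real, imag)

phi0 = characteristic_function(normal_pdf, 0.0)
phi1 = characteristic_function(normal_pdf, 1.0)
print(phi0)                  # should be very close to 1
print(phi1, math.exp(-0.5))  # the standard normal has exp(-omega^2/2)
```

For a symmetric density such as this one the imaginary component is essentially zero; a skewed density would give a nonzero imaginary part.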

Properties of Characteristic Functions

The crucial property of characteristic functions is that the characteristic function of the sum of two independent random variables is the product of those variables' characteristic functions. It is often more convenient to work with the natural logarithm of the characteristic function so that instead of products one can work with sums. This property can be represented as follows. If Φx(ω) and Φy(ω) are the characteristic functions of independent random variables x and y, respectively, then the characteristic function of the variable obtained by taking an observation of x and an observation of y and adding them together is given by:

$$\Phi_{x+y}(\omega) = \Phi_x(\omega)\,\Phi_y(\omega)$$
and hence
$$\log(\Phi_{x+y}(\omega)) = \log(\Phi_x(\omega)) + \log(\Phi_y(\omega))$$

If two variables are not independent, the proposition concerning the characteristic functions involves the characteristic function of the conditional probability distribution.

The process of aggregating data such as combining monthly data to obtain quarterly or annual data is easily presented in terms of characteristic functions. If the smaller unit data are statistically independent then the proposition concerning the characteristic function of the sum of random variables applies.
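The product rule for sums of independent variables can be checked numerically. The sketch below is an illustration under stated assumptions: x and y are taken to be independent and uniform on [0, 1], so that their sum has the triangular density on [0, 2]. The helper `cf` and the two density functions are hypothetical names introduced only for this example.

```python
import cmath

def cf(pdf, omega, lo, hi, steps=2000):
    """Midpoint-rule approximation of E{exp(i*omega*z)} over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        z = lo + (k + 0.5) * h
        total += cmath.exp(1j * omega * z) * pdf(z) * h
    return total

def uniform_pdf(z):
    """Density of a variable uniform on [0, 1]."""
    return 1.0 if 0.0 <= z <= 1.0 else 0.0

def triangular_pdf(z):
    """Density of the sum of two independent uniform [0, 1] variables."""
    if 0.0 <= z <= 1.0:
        return z
    if 1.0 < z <= 2.0:
        return 2.0 - z
    return 0.0

omega = 1.7
product = cf(uniform_pdf, omega, 0.0, 1.0) ** 2  # product of the two characteristic functions
direct = cf(triangular_pdf, omega, 0.0, 2.0)     # characteristic function of the sum itself
print(product, direct)  # the two values should agree closely
```

The same check extends to aggregation over more periods: for three independent monthly values the quarterly characteristic function would be the product of three factors.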

There is another operation that is often involved with combining random variables. Suppose x and y have different probability distributions but they are treated as coming from the same population. In effect the probability distribution of the combination involves the probabilities that an observation came from the x population or the y population. Let these probabilities be represented as $P_x$ and $P_y$ and let their probability distributions be denoted as $f_x$ and $f_y$, respectively. The probability density that an observation from the combined population has the value z, $f_z(z)$, is:

$$f_z(z) = P_x f_x(z) + P_y f_y(z)$$

If the characteristic functions for $f_x$ and $f_y$ are $\Phi_x$ and $\Phi_y$ then the characteristic function for the combined population is given by:

$$\Phi_z(\omega) = P_x\Phi_x(\omega) + P_y\Phi_y(\omega)$$
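The mixture rule can be illustrated in the same way. This sketch assumes two normal populations with made-up parameters and mixing probabilities 0.3 and 0.7; it compares the weighted sum of the closed-form normal characteristic functions, exp(iμω - σ²ω²/2), with a direct numerical integration against the mixture density. All names and numbers are assumptions of the example.

```python
import cmath
import math

def normal_pdf(z, mean, sd):
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))

def normal_cf(omega, mean, sd):
    """Closed-form characteristic function of a normal distribution."""
    return cmath.exp(1j * mean * omega - 0.5 * (sd * omega) ** 2)

# Hypothetical populations and mixing probabilities (P_x + P_y = 1).
P_X, P_Y = 0.3, 0.7
MEAN_X, SD_X = -1.0, 1.0
MEAN_Y, SD_Y = 2.0, 1.5

def mixture_pdf(z):
    return P_X * normal_pdf(z, MEAN_X, SD_X) + P_Y * normal_pdf(z, MEAN_Y, SD_Y)

def mixture_cf_numeric(omega, lo=-30.0, hi=30.0, steps=6000):
    """Integrate exp(i*omega*z) against the mixture density (midpoint rule)."""
    h = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        z = lo + (k + 0.5) * h
        total += cmath.exp(1j * omega * z) * mixture_pdf(z) * h
    return total

omega = 0.8
weighted = P_X * normal_cf(omega, MEAN_X, SD_X) + P_Y * normal_cf(omega, MEAN_Y, SD_Y)
numeric = mixture_cf_numeric(omega)
print(weighted, numeric)  # the two values should agree closely
```

Note that here the characteristic functions add, whereas for a sum of independent variables they multiply.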

The Logarithm of a Characteristic Function

It is usually more convenient to work with the logarithm of the characteristic function, log(Φ(ω)). The logarithm of the characteristic function will also be a complex function with real and imaginary components. The logarithm of a variable W is defined as w if:

exp(w) = W.

For a complex variable X+iY we must find x+iy such that

exp(x+iy) = X+iY.


Since

$$\exp(x+iy) = \exp(x)\exp(iy) = \exp(x)(\cos(y) + i\sin(y)) = \exp(x)\cos(y) + i\exp(x)\sin(y)$$

it follows that

X = exp(x)cos(y)
Y = exp(x)sin(y)

Thus the imaginary component y can be determined from:

$$\tan(y) = Y/X \quad \text{and hence} \quad y = \tan^{-1}(Y/X)$$

The real component x can then be found from:

x = log(X) - log(cos(y)).

Note that the real and imaginary components of the log-characteristic function are not simply the logarithms of the corresponding real and imaginary components of the characteristic function.
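A small sketch of this computation follows. It is an illustration rather than the author's procedure: it uses `math.atan2(Y, X)` in place of tan⁻¹(Y/X) so that the angle lands in the correct quadrant when X is not positive, and it takes the real part as the logarithm of the modulus, which equals log(X) - log(cos(y)) whenever X > 0. The result is the principal value of the logarithm and is compared with Python's `cmath.log`.

```python
import cmath
import math

def complex_log(X, Y):
    """Return (x, y) with exp(x + iy) = X + iY, using the principal angle."""
    x = 0.5 * math.log(X * X + Y * Y)  # log of the modulus sqrt(X^2 + Y^2)
    y = math.atan2(Y, X)               # angle in (-pi, pi]
    return x, y

X, Y = 0.6, 0.3
x, y = complex_log(X, Y)
reference = cmath.log(complex(X, Y))

print(x, y)
print(reference)
# The real part is not log(X), and the imaginary part is not log(Y):
print(math.log(X), math.log(Y))
```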

The Characteristic Function as a
Moment-Generating Function

The moments of a probability distribution are the expected values of the powers of the random variable; i.e.,

$$E\{z^n\} = \int_{-\infty}^{\infty} z^n\,p(z)\,dz$$
where n = 1, 2, 3,....

The value of n=0 could also be included in this definition. However for n=0 the value is the area under the probability distribution which is by definition equal to unity. Note that for ω=0 the characteristic function must have a value of unity.

The connection between the moments of a probability distribution and its characteristic function is seen from taking the derivative of the characteristic function with respect to the parameter ω. For the first derivative

$$d\Phi(\omega)/d\omega = \int_{-\infty}^{\infty} (iz)\exp(i\omega z)\,p(z)\,dz$$

Thus when ω=0, dΦ(ω)/dω is equal to $iE\{z\}$. Likewise for ω=0, $d^2\Phi(\omega)/d\omega^2$ is equal to $i^2E\{z^2\}$. In general

for ω=0,
$$d^n\Phi(\omega)/d\omega^n = i^nE\{z^n\}$$
for n any positive integer.

However, since Φ(0) = 1, the case of n=0 also fits into this scheme.
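The moment relation can be checked with finite differences. The sketch below assumes a normal distribution with mean 1.5 and standard deviation 2 (arbitrary values chosen for the example) and uses its closed-form characteristic function; the derivatives at ω=0 are approximated by central differences and divided by the appropriate power of i.

```python
import cmath

MEAN, SD = 1.5, 2.0  # hypothetical parameters for the illustration

def normal_cf(omega):
    """Closed-form characteristic function of a normal distribution."""
    return cmath.exp(1j * MEAN * omega - 0.5 * (SD * omega) ** 2)

def first_derivative_at_zero(cf, h=1e-3):
    """Central-difference approximation of dPhi/d(omega) at omega = 0."""
    return (cf(h) - cf(-h)) / (2.0 * h)

def second_derivative_at_zero(cf, h=1e-3):
    """Central-difference approximation of d^2 Phi/d(omega)^2 at omega = 0."""
    return (cf(h) - 2.0 * cf(0.0) + cf(-h)) / (h * h)

first_moment = (first_derivative_at_zero(normal_cf) / 1j).real           # E{z}
second_moment = (second_derivative_at_zero(normal_cf) / (1j ** 2)).real  # E{z^2}

print(first_moment)   # should be close to the mean
print(second_moment)  # should be close to mean^2 + sd^2
```

The variance then follows as the second moment minus the square of the first.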

The Characteristic Functions of
the Stable Distributions

Paul Lévy found the formula for the characteristic function of all stable distributions. The characteristic function Φ(ω) of a stable distribution must be such that its logarithm is of the form:

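$$\log(\Phi(\omega)) = i\delta\omega - |\nu\omega|^{\alpha}\,(1 - i\beta F(\omega,\alpha,\nu))$$
where
$$F(\omega,\alpha,\nu) = \begin{cases} \operatorname{sgn}(\omega)\tan(\pi\alpha/2) & \text{if } \alpha \ne 1 \\ -(2/\pi)\operatorname{sgn}(\omega)\log(|\nu\omega|) & \text{if } \alpha = 1 \end{cases}$$
and where
$$\operatorname{sgn}(\omega) = \begin{cases} +1 & \text{if } \omega > 0 \\ 0 & \text{if } \omega = 0 \\ -1 & \text{if } \omega < 0 \end{cases}$$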

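The nature and allowable ranges for the parameters are as follows:

α, the stability index: 0 < α ≤ 2
β, the skewness parameter: -1 ≤ β ≤ 1
ν, the dispersion parameter: ν > 0
δ, the location parameter: any real value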

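For a normal distribution α=2, β=0 (the β term vanishes in any case because tan(π)=0) and δ is equal to the mean. Since the log-characteristic function of a normal distribution with mean μ and standard deviation σ is iμω - σ²ω²/2, ν is equal to the standard deviation divided by √2. Thus the log-characteristic function for a normal distribution is of the form:

$$\log(\Phi(\omega)) = i\delta\omega - |\nu\omega|^{2}.$$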
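The general formula for the log-characteristic function of a stable distribution can be written as a short function. This is a sketch under stated assumptions, not a reference implementation: the name `stable_log_cf` is hypothetical, and the α = 1 branch carries the factor sgn(ω), which keeps the imaginary part odd in ω as is required of the characteristic function of any real variable. The value at ω = 0 is set to 0 so that Φ(0) = 1.

```python
import math

def sgn(w):
    """Sign function: +1, 0 or -1."""
    return (w > 0) - (w < 0)

def stable_log_cf(omega, alpha, beta, nu, delta):
    """log(Phi(omega)) = i*delta*omega - |nu*omega|^alpha * (1 - i*beta*F)."""
    if omega == 0.0:
        return 0j  # log(Phi(0)) = log(1) = 0
    scaled = abs(nu * omega)
    if alpha != 1.0:
        F = sgn(omega) * math.tan(math.pi * alpha / 2.0)
    else:
        # Assumption of this sketch: sgn(omega) is included in the alpha = 1 case.
        F = -(2.0 / math.pi) * sgn(omega) * math.log(scaled)
    return 1j * delta * omega - scaled ** alpha * (1.0 - 1j * beta * F)

# alpha = 2 gives the normal form i*delta*omega - (nu*omega)^2.
normal_case = stable_log_cf(0.5, 2.0, 0.0, 1.0, 0.3)
# alpha = 1, beta = 0 gives the Cauchy form i*delta*omega - nu*|omega|.
cauchy_case = stable_log_cf(0.7, 1.0, 0.0, 2.0, 0.0)
print(normal_case, cauchy_case)
```

Evaluating the function at ω and -ω for a skewed case, such as α = 1.5 and β = 0.5, gives complex conjugates, which is a quick sanity check on the signs.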

Some cases for particular values of the parameters are shown below:

The Discovery of the Stable Distributions

Prior to Paul Lévy's mathematical analysis, empirical investigators were finding cases in which the histograms of some variable, while generally looking like normal distributions, were deviating from the normal distribution in a systematic manner. For example, the economist Wesley Clair Mitchell in 1915 found that the distribution of the percentage changes in stock prices, when compared to the best-fitting normal distribution, consistently deviated from the normal distribution as shown below:

This sort of deviation means that there would be too many very small deviations from the average, too many very large deviations and too few moderate deviations. The extreme large changes were of particular interest because those were the cases of stock market booms and busts. Because a higher proportion of the probability was in the tails of the distribution compared with the case of the normal distribution such distributions were called fat-tailed distributions. They were also given a name based upon Greek, leptokurtic.

There are Lévy-Pareto stable distributions that are leptokurtic. Furthermore, there is a generalization of the Central Limit Theorem that says that if the suitably normalized sum of a large number of independent, identically distributed random variables has a limiting distribution, then that limit is a stable distribution. Thus if some phenomenon such as changes in stock prices or rain from a storm is the result of a large number of independent influences then it would be expected that its distribution would be a stable distribution. If the distribution is a fat-tailed distribution then that fact would account for the unexpected extreme changes in a variable, the sort of occurrences associated with catastrophes.

The Determination of the Parameters of
a Stable Distribution from the
Empirical Estimates of Its
Characteristic Function

The real component of the log-characteristic function for a stable distribution is

$$\chi(\omega) = -|\nu\omega|^{\alpha} = -\nu^{\alpha}|\omega|^{\alpha}$$
and therefore
$$\nu^{\alpha}|\omega|^{\alpha} = -\chi(\omega)$$

This last relationship implies that:

log(-χ(ω)) = αlog(|ω|) + αlog(ν)

Thus for a stable distribution the graph of log(-χ(ω)), the logarithm of the negative of the real component of the log-characteristic function, as a function of log(|ω|) is a straight line, the slope of which is the stability index of the distribution, α.

The value of log(-χ(ω)) when ω=1 is the intercept of the straight line and is equal to αlog(ν). Thus a knowledge of the intercept and the value of α determines the value of ν, the dispersion parameter of the distribution; i.e.,

log(ν) = log(-χ(1))/α
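The straight-line relationship suggests a simple way to recover α and ν. The sketch below is an illustration on exact, noise-free values: it generates χ(ω) for assumed parameters α = 1.5 and ν = 0.8 at a handful of arbitrarily chosen frequencies, fits a least-squares line to log(-χ(ω)) against log(|ω|), and reads α from the slope and ν from the intercept. The helper `fit_line` is a hypothetical name for an ordinary least-squares fit.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Assumed true parameters, used only to generate test values of chi(omega).
true_alpha, true_nu = 1.5, 0.8
omegas = [0.25, 0.5, 1.0, 2.0, 4.0]
chi = [-abs(true_nu * w) ** true_alpha for w in omegas]

xs = [math.log(abs(w)) for w in omegas]
ys = [math.log(-c) for c in chi]

alpha_hat, intercept = fit_line(xs, ys)   # slope is alpha
nu_hat = math.exp(intercept / alpha_hat)  # intercept is alpha*log(nu)
print(alpha_hat, nu_hat)
```

With sample estimates of χ(ω) the points would scatter about the line, and the choice of frequencies would matter; that estimation problem is outside the scope of this sketch.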

The imaginary component of the log-characteristic function for stable distributions is:

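$$\upsilon(\omega) = \delta\omega + \beta\nu^{\alpha}|\omega|^{\alpha}F(\omega,\alpha,\nu)$$

With α and ν known the values of δ and β can be determined from the imaginary component of the log-characteristic function. The values of δ and β can be found from any two points on the curve; i.e., by solving the linear equations in the two unknowns δ and β:

$$\upsilon(\omega_1) = \delta\omega_1 + \beta\nu^{\alpha}|\omega_1|^{\alpha}F(\omega_1,\alpha,\nu)$$
$$\upsilon(\omega_2) = \delta\omega_2 + \beta\nu^{\alpha}|\omega_2|^{\alpha}F(\omega_2,\alpha,\nu)$$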
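Solving the two linear equations can be sketched with Cramer's rule. This is an illustration under assumptions: the coefficient of β is taken to be ν^α|ω|^α F(ω,α,ν), which is what the expression for log(Φ(ω)) given earlier implies, and the α = 1 branch of F includes sgn(ω). The parameter values and the two frequencies are made up for the example, and the function names are hypothetical.

```python
import math

def sgn(w):
    return (w > 0) - (w < 0)

def F(omega, alpha, nu):
    """The function F(omega, alpha, nu); omega must be nonzero."""
    if alpha != 1.0:
        return sgn(omega) * math.tan(math.pi * alpha / 2.0)
    return -(2.0 / math.pi) * sgn(omega) * math.log(abs(nu * omega))

def beta_coefficient(omega, alpha, nu):
    """Coefficient multiplying beta in the imaginary component."""
    return nu ** alpha * abs(omega) ** alpha * F(omega, alpha, nu)

def solve_delta_beta(w1, u1, w2, u2, alpha, nu):
    """Solve u_k = delta*w_k + beta*b_k (k = 1, 2) by Cramer's rule."""
    b1 = beta_coefficient(w1, alpha, nu)
    b2 = beta_coefficient(w2, alpha, nu)
    det = w1 * b2 - w2 * b1
    if det == 0.0:
        raise ValueError("the two points do not determine delta and beta")
    delta = (u1 * b2 - u2 * b1) / det
    beta = (w1 * u2 - w2 * u1) / det
    return delta, beta

# Assumed parameters, used only to generate the two points on the curve.
alpha, nu, true_delta, true_beta = 1.5, 0.8, 0.25, -0.4
w1, w2 = 0.5, 2.0
u1 = true_delta * w1 + true_beta * beta_coefficient(w1, alpha, nu)
u2 = true_delta * w2 + true_beta * beta_coefficient(w2, alpha, nu)

delta_hat, beta_hat = solve_delta_beta(w1, u1, w2, u2, alpha, nu)
print(delta_hat, beta_hat)
```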

Thus if a probability distribution is actually a stable distribution it is an easy matter to determine the values of its parameters from its log-characteristic function. The problem is how to properly make the estimates when the true probability distribution is not known but only a sample estimate is available. For this topic see:

The Estimation of the Parameters of the Characteristic Function of a Levy-Stable Probability Distribution.
