The Derivation of the Uncertainty Principle

San José State University

applet-magic.com Thayer Watkins Silicon Valley & Tornado Alley U.S.A.

In 1927 the German physicist Werner Heisenberg articulated the principle that the more precisely the position of a particle is known, the less precisely its momentum is known, and vice versa. This was called the Uncertainty Principle. Stated in this verbal form it does not lend itself to rigorous derivation.
More precisely the Uncertainty Principle is that if σ_x is the standard deviation of the location of the particle and σ_p is the standard deviation of its momentum then
σ_x·σ_p ≥ ½h

where h is Planck's constant divided by 2π.
Crude Illustrations of the Uncertainty Principle

Light Diffracted through a Slit

Properly speaking, the Uncertainty Principle measures the uncertainty of a variable by its standard deviation, but other measures of uncertainty, such as a variable's range, can be used to illustrate it. In such illustrations, however, the lower limit on the product may be some multiple of ½h rather than ½h itself.
Assume that light of wavelength λ is passing through a slit of width a. By the de Broglie relation the momentum p of a photon in this light is
p = h/λ

where h is Planck's constant (here not divided by 2π).
As a result of passing through the slit the momentum of a photon is not changed in magnitude but it may be changed in direction. The component of momentum across the slit, say p_y, is p·sin(θ), where θ is the angle of diffraction measured from the original direction of travel. There is a distribution of diffraction angles that involves maxima and minima. There is a peak at θ=0. The nearby minima occur where
sin(θ) = ±λ/a

Since sin(θ)≅θ for small θ, the nearby minima for the distribution of the diffraction angle are at θ=±(λ/a).
Thus sin(θ) ranges from −(λ/a) to +(λ/a) so its range is 2(λ/a), which can be taken to be a measure of the uncertainty in sin(θ).
Therefore
Δ(p_y) = pΔ(sin(θ)) = p(2λ/a)
but
p = h/λ
and hence
Δ(p_y) = (h/λ)(2λ/a) = 2h/a

The uncertainty in the location y of the photon may be taken to be the width of the slit, a; i.e., Δy=a. Thus
Δy·Δp_y = a(2h/a) = 2h
which, since h here is the full Planck's constant, is well above the lower limit of ½(h/2π).
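
The slit estimate can be checked numerically. The following is a minimal sketch, not part of the original derivation: the wavelength and slit width are arbitrary illustrative values, and the function name is hypothetical. It keeps the full Planck constant and the reduced constant (Planck's constant divided by 2π) in separate variables so the comparison with the lower limit is unambiguous.

```python
import math

H = 6.62607015e-34         # Planck's constant in J*s (exact SI value)
HBAR = H / (2 * math.pi)   # Planck's constant divided by 2*pi

def slit_uncertainty(wavelength, slit_width):
    """Return (delta_y, delta_p_y) for the range-based slit estimate."""
    p = H / wavelength                              # de Broglie momentum
    delta_p_y = p * (2 * wavelength / slit_width)   # range of the transverse momentum
    delta_y = slit_width                            # position range = slit width
    return delta_y, delta_p_y

# Illustrative values (assumed): green light and a 10 micrometre slit.
delta_y, delta_p_y = slit_uncertainty(500e-9, 10e-6)
product = delta_y * delta_p_y

print("product / h        =", product / H)
print("product / (hbar/2) =", product / (HBAR / 2))
```

The product does not depend on the wavelength or the slit width chosen; both cancel, which is why the estimate gives the same multiple of Planck's constant for any slit.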

Observation of a Particle with a Microscope

Consider a particle Q on the plate of a microscope where the angle subtended by the objective lens of the microscope is θ. The plate is illuminated from below by light of wavelength λ. A photon of the light has a momentum of p which is equal to h/λ. The particle is observed when a photon is scattered by the particle, enters the objective lens and is then guided by the optics of the microscope into the eye of the observer.
The direction of the momentum of the photon is changed by its collision with the particle. The momentum of the photon is uncertain but for the photon to be observed by the observer its direction has to be within the angle θ subtended by the objective lens from the viewpoint of the particle. This means that the x-component of the momentum of the photon has to be between +p·sin(θ/2) and −p·sin(θ/2).
The particle experiences a recoil from the collision of the photon with it that results in the x-component of its momentum, p_x, also being between −p·sin(θ/2) and +p·sin(θ/2). The range of the x-component of the momentum of the particle is 2p·sin(θ/2). Hence
Δp_x = 2p·sin(θ/2) = 2(h/λ)sin(θ/2)

What is observed at the eye-piece of the microscope is the diffraction pattern of the photons scattered by the particle. The width of that diffraction pattern is λ/sin(θ/2). This is called the resolving power of the microscope. Thus
Δx = λ/sin(θ/2)

The product of the ranges of the location x and the x-component of the momentum of the particle is then
Δp_x·Δx = 2(h/λ)sin(θ/2)(λ/sin(θ/2)) = 2h
which, with h again the full Planck's constant, is of course greater than ½(h/2π).
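
As a quick check that the microscope product is independent of both the wavelength and the aperture angle, the two ranges can be evaluated for a few cases. This is a sketch only; the function name and the sample wavelengths and angles are assumptions chosen for illustration.

```python
import math

H = 6.62607015e-34   # Planck's constant in J*s (exact SI value)

def microscope_product(wavelength, aperture_angle):
    """Delta_x * Delta_p_x for the microscope estimate in the text."""
    half = aperture_angle / 2
    delta_p_x = 2 * (H / wavelength) * math.sin(half)   # recoil momentum range
    delta_x = wavelength / math.sin(half)               # width of the diffraction pattern
    return delta_x * delta_p_x

# Assumed sample values: two wavelengths (m) and three aperture angles (rad).
ratios = [microscope_product(lam, theta) / H
          for lam in (400e-9, 700e-9)
          for theta in (0.2, 1.0, 2.0)]
print(ratios)
```

A wider aperture sharpens the position estimate but enlarges the recoil range by the same factor, so every entry is the same multiple of Planck's constant.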

A Proper Derivation of the Uncertainty Principle

A quantum mechanical system is characterized by a complex function defined over space called its wave function. The wave function is such that its squared magnitude is equal to the probability density for the system. That is to say, if ψ(X) is the wave function value for a particle at the point X then the probability density at X is |ψ(X)|²=ψ*(X)ψ(X), where ψ*(X) is the complex conjugate of ψ(X).
For a system with a wave function ψ(X) the expected value of a variable f(X) for the system is given by
E(f(X)) = ∫f(X)|ψ(X)|²dX = ∫ψ*(X)f(X)ψ(X)dX

In quantum mechanics the expected value of a variable f(X) is denoted as <f>.
The Schroedinger Equation

The wave function for a system is found as a solution to its Schroedinger equation. The Schroedinger equation for a system is derived from its Hamiltonian function. The Hamiltonian function of a system is its total energy, kinetic plus potential, with the kinetic energy expressed as a function of its momentum.
In mathematics an operator is a function defined over a set of functions and whose values are functions. The Schroedinger equation for a system is derived from its Hamiltonian operator, which is its Hamiltonian function with momentum replaced by −ih∇, where i is the imaginary unit, h is Planck's constant divided by 2π and ∇ is the gradient operator for the system. For a one dimensional system in which the spatial variable is x the gradient operator is ∂/∂x.
If Ĥ is the Hamiltonian operator for the system and t is time then the Schroedinger equation for the system is
Ĥψ = ih(∂ψ/∂t)

Deviations from the Expected Values

The deviation of a variable f(X) from its expected value is expressed
Δf = f(X) − <f>

The expected value of the squared deviations of a variable is called its variance. Let V(f) be the variance of the variable f(X). Then
V(f) = <(Δf)²> = ∫(Δf(X))²|ψ(X)|²dX = ∫ψ*(X)(Δf)²ψ(X)dX
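
The expected value and variance integrals can be evaluated numerically for a concrete wave function. The sketch below assumes a one dimensional Gaussian wave function centred at x0 = 1 with width parameter 0.5 (both values chosen only for illustration) and uses a simple midpoint rule; none of these choices come from the text.

```python
import math

def psi(x, x0=1.0, sigma=0.5):
    """Normalized real Gaussian wave function centred at x0 (an assumed example)."""
    norm = (2 * math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-(x - x0) ** 2 / (4 * sigma ** 2))

def expected_value(f, lo=-10.0, hi=10.0, n=20000):
    """Approximate <f> = integral of f(x)|psi(x)|^2 dx with the midpoint rule."""
    dx = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * dx
        total += f(x) * abs(psi(x)) ** 2 * dx
    return total

total_probability = expected_value(lambda x: 1.0)     # should be close to 1
mean_x = expected_value(lambda x: x)                   # <x>
var_x = expected_value(lambda x: (x - mean_x) ** 2)    # V(x) = <(x - <x>)^2>

print(total_probability, mean_x, var_x)
```

For this wave function the probability density is a normal density, so the mean should come out near x0 and the variance near the square of the width parameter.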

A One Dimensional System

Let x be the spatial dimension and p the linear momentum for a system. Consider the deviations Δx and Δp. The operator corresponding to Δp is denoted ΔP.
The Schwarz Inequality

Let f and g be two complex functions over the variable x, and let f* denote the complex conjugate of f. The Schwarz Inequality is then
[∫|f|²dx]·[∫|g|²dx] ≥ |∫f*g dx|²

In the Schwarz Inequality let f be Δx·ψ and g be ΔP·ψ. Then, since Δx is real and thus the same as its conjugate,
V(x)·V(p) ≥ |∫ψ*Δx·ΔPψ dx|²

The integral ∫ψ*Δx·ΔPψ dx is equivalent to <Δx·Δp>. Thus
V(x)·V(p) ≥ |<Δx·Δp>|²

The commutator of two operators, R and Q, is defined as
[R, Q] = RQ − QR

Thus an operator RQ can be expressed
RQ = RQ − ½QR + ½QR
= ½RQ − ½QR + ½RQ + ½QR
which is equivalent to
RQ = ½[R, Q] + ½(RQ + QR)
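
The splitting of a product into half its commutator plus half its symmetric part holds for any pair of square matrices, which makes it easy to check. The sketch below uses two arbitrary 2×2 matrices picked only for illustration; the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def combine(a, b, alpha, beta):
    """Entrywise alpha*a + beta*b."""
    n = len(a)
    return [[alpha * a[r][c] + beta * b[r][c] for c in range(n)]
            for r in range(n)]

# Two arbitrary (assumed) matrices that do not commute.
R = [[1, 2], [3, 4]]
Q = [[0, 1j], [1, 2]]

RQ, QR = matmul(R, Q), matmul(Q, R)
commutator = combine(RQ, QR, 1, -1)       # [R, Q] = RQ - QR
symmetric_part = combine(RQ, QR, 1, 1)    # RQ + QR
rebuilt = combine(commutator, symmetric_part, 0.5, 0.5)

max_error = max(abs(rebuilt[r][c] - RQ[r][c])
                for r in range(2) for c in range(2))
print("commutator:", commutator)
print("largest deviation from RQ:", max_error)
```

The commutator printed here is nonzero, so the check is not trivially satisfied by matrices that happen to commute.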

For the case of the operator ΔxΔP this means
ΔxΔP = ½[Δx, ΔP] + ½(ΔxΔP + ΔPΔx)

According to Heisenberg's Matrix Mechanics, which will be explained later, the commutator of Δx and ΔP is equal to ih. Thus
ΔxΔP = ½ih + ½(ΔxΔP + ΔPΔx)
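
The value of the commutator can also be checked directly in wave mechanics. The following is a sketch under the assumption that the momentum operator acts as P = −iħ ∂/∂x, where ħ denotes Planck's constant divided by 2π (written h in this section). The constants <x> and <p> commute with everything, so they drop out and it is enough to work with x and P acting on an arbitrary wave function ψ.

```latex
\begin{aligned}
[\Delta x, \Delta P] &= [\,x - \langle x\rangle,\; P - \langle p\rangle\,] = [x, P], \\
[x, P]\,\psi &= x\left(-i\hbar\,\frac{\partial\psi}{\partial x}\right)
               - \left(-i\hbar\,\frac{\partial}{\partial x}\right)(x\,\psi) \\
             &= -i\hbar\,x\,\frac{\partial\psi}{\partial x}
               + i\hbar\,\psi
               + i\hbar\,x\,\frac{\partial\psi}{\partial x}
              = i\hbar\,\psi .
\end{aligned}
```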

The expected values of the operator products ΔxΔP and ΔPΔx are the same as those of ΔxΔp and ΔpΔx. Therefore
<ΔxΔp> = ½ih + ½<ΔxΔp + ΔpΔx>

On the RHS of the above equation the term ½ih is purely imaginary and the other term is purely real, because ΔxΔP + ΔPΔx is a Hermitian operator. Thus the squared magnitude of their sum is
|<ΔxΔp>|² = ¼h² + ¼<ΔxΔp + ΔpΔx>²

Since the second term on the RHS is nonnegative
|<ΔxΔp>|² ≥ ¼h²
and since
V(x)·V(p) ≥ |∫ψ*Δx·ΔPψ dx|²
it follows that
V(x)·V(p) ≥ ¼h²

This is the Uncertainty Principle. Stated in terms of the standard deviations σ_x and σ_p, which are the square roots of the variances,
σ_x·σ_p ≥ ½h

Thus the product of the uncertainties of the location and of the momentum of a particle must be at least as great as ½h.
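
The bound can be tested numerically on particular wave functions. The sketch below works in units where Planck's constant divided by 2π equals 1 and considers only real wave functions, for which the expected momentum is zero and the expected squared momentum reduces to the integral of the squared derivative of ψ. The two trial functions (a Gaussian and a hyperbolic secant) are illustrative assumptions, as are the grid limits and step sizes.

```python
import math

HBAR = 1.0   # natural units (an assumption made for readability)

def uncertainty_product(psi, lo=-30.0, hi=30.0, n=60000):
    """sigma_x * sigma_p for a real wave function psi, by midpoint-rule integration."""
    dx = (hi - lo) / n
    xs = [lo + (k + 0.5) * dx for k in range(n)]
    norm = sum(psi(x) ** 2 for x in xs) * dx           # normalize numerically
    mean_x = sum(x * psi(x) ** 2 for x in xs) * dx / norm
    var_x = sum((x - mean_x) ** 2 * psi(x) ** 2 for x in xs) * dx / norm
    # For a real psi, <p> = 0 and <p^2> = hbar^2 * integral of (dpsi/dx)^2 dx.
    eps = 1e-5
    def dpsi(x):
        return (psi(x + eps) - psi(x - eps)) / (2 * eps)   # central difference
    var_p = HBAR ** 2 * sum(dpsi(x) ** 2 for x in xs) * dx / norm
    return math.sqrt(var_x * var_p)

def gaussian(x):
    return math.exp(-x ** 2 / 4)

def sech(x):
    return 1 / math.cosh(x)

product_gaussian = uncertainty_product(gaussian)
product_sech = uncertainty_product(sech)
print(product_gaussian, product_sech, HBAR / 2)
```

The Gaussian sits essentially on the lower limit, while the hyperbolic secant lies slightly above it, which is consistent with the inequality being attained only for Gaussian wave packets.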
This derivation is one dimensional. For the special problems of the Uncertainty Principle in the two and three dimensional cases see 2D Uncertainty.
The Uncertainty Principle is the basis for the notion in the Copenhagen Interpretation of quantum theory that particles exist only as probability density functions and generally are not physically real. This is not correct. Particles have time-spent probability distributions which satisfy the Uncertainty Principle. For the demonstration of this proposition for a harmonic oscillator see Oscillator and Uncertainty. The Copenhagen Interpretation confuses the blurriness of a rapidly rotating fan with an intrinsic indeterminacy of the fan itself.
The History of the Uncertainty Principle

Werner Heisenberg was often labeled as the wunderkind (wonder child) of physics. Some thought he looked like a farm boy, but he was anything but a rural bumpkin. His father was an academic scholar on the faculty of Munich University. Werner's grandfather was the head of a prestigious gymnasium (secondary school). Werner and his older brother attended that school. After graduation Werner Heisenberg wanted to study mathematics at Munich University but his interview in the mathematics department there did not go well and he was not accepted. His father called upon his old friend, the physicist Arnold Sommerfeld, to accept Werner into the physics program. Sommerfeld was one of the top physicists in the world at that time and Werner was indeed fortunate to undertake his education at Sommerfeld's institution. It happened that Wolfgang Pauli was also studying there and Werner made his acquaintance.
Niels Bohr gave a series of lectures at Göttingen and Sommerfeld and a contingent of his students, including Heisenberg, journeyed there to attend those lectures. Bohr sought out Heisenberg and invited him to come to Copenhagen for a term. Sommerfeld however wanted Heisenberg to study first with Max Born at Göttingen. Heisenberg did so and a lifelong collaboration with Born commenced. After his term at Göttingen Heisenberg went back to complete a thesis for his doctorate. In the defense of his thesis Heisenberg did poorly in explaining some topics unrelated to his thesis and one examiner wanted to fail him, but Sommerfeld negotiated a compromise. Heisenberg was given his doctorate degree but with the lowest possible grade. In disgrace Heisenberg fled to Göttingen. After completing some research on the anomalous Zeeman effect Heisenberg contacted Bohr to tell him of his work. Bohr invited him to visit his institute in Copenhagen for a few weeks. At first Bohr was too busy to spend any significant time with Heisenberg, but then Bohr took time out from his research to spend several days on a hiking tour with Heisenberg. After becoming acquainted with the depth of Heisenberg's talent he invited him to stay at Bohr's institute for an extended period of time. The time was 1924 and Heisenberg was only 22 years old.
In 1925 Heisenberg returned to Göttingen and was disappointed with the lower intellectual level there compared to Copenhagen. He set about trying to explain the intensity of the spectral lines of hydrogen. Bohr's model gave accurate predictions of the frequency and wavelength of those spectral lines but had nothing to say about why some were brighter than others. Heisenberg did not make much progress in this task and was despondent. Then in June of 1925 he had a severe attack of hay fever because of allergies to pollen. His eyes swelled shut. In desperation he went to the North Sea coast to get away from the pollen. He had a two week leave from Göttingen and decided to spend it on the island of Helgoland.
Freed from his hay fever Heisenberg began to work on the problem of the intensity of the spectral lines of hydrogen. He conceptualized the problem as there being a set of successively higher energy levels for an electron in a hydrogen atom, say {E_k: k=0, 1, 2, …}. When an electron moves from a higher level k to a lower level j, radiation of energy (E_k−E_j) is emitted. The frequency of this radiation is such that
hν_kj = E_k − E_j for k>j

where h is Planck's constant. This much was the standard perception and in it there is no reason for one frequency to give any brighter line than another frequency.
His real insight came with the realization that the intensity depended upon the probability of the transition. When the probability is higher more atoms are involved in the transition and hence the spectral line looks brighter.
Somehow this led to a revelation on a rock overlooking the North Sea that he needed a type of quantity whose product depended upon the order of multiplication. In other words, there are terms, say P and Q, such that P×Q is different from Q×P and hence P×Q−Q×P is not zero.
When he left Helgoland he went immediately to Hamburg to see Wolfgang Pauli. Pauli was encouraging, which was not typical of his response to strange ideas. Pauli once remarked of someone's theory that it was so far afield that it wasn't even wrong. With the encouragement from Pauli, Heisenberg decided to write up his ideas as an article. At Göttingen he gave a draft of the article to Max Born to read in a short period of time. When Born got to read the article he was enthusiastic about it. The odd sort of multiplication puzzled Born until he realized that the multiplication of square matrices had that property.
Born asked Pauli to help Heisenberg with the mathematics for the article, but Pauli declined because he felt the mathematics would get in the way of the physics. Born then chose Pascual Jordan, a 22 year old graduate student in physics at Göttingen who had prior training in mathematics. Jordan was familiar with the algebra of matrices so he was the perfect choice.
Born and Jordan then worked on extending Heisenberg's ideas. They came up with the formulation that if Q represented the state of a quantum system and P its momenta then
QP − PQ = i(h/2π)I

where juxtaposition represents multiplication, i is the square root of negative one, h is Planck's constant and I is the identity matrix. An identity matrix is one with ones on the principal diagonal and zeroes everywhere else. It serves as the equivalent of unity in matrix multiplications; i.e., for any compatible matrix X, IX=X and XI=X. And, oh yes, the matrices in the above relationship are infinite order so one sees only a middle portion of them. It was heady mathematics.
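
The remark about infinite order can be made concrete. The sketch below builds finite truncations of position and momentum matrices for a harmonic oscillator in units where Planck's constant divided by 2π, the mass and the frequency are all 1. The harmonic oscillator is used here only as a convenient example and the function names are hypothetical; this is not a reconstruction of Born and Jordan's own calculation.

```python
import math

def oscillator_matrices(n):
    """Truncated n x n position (Q) and momentum (P) matrices, units hbar = m = omega = 1."""
    lower = [[0j] * n for _ in range(n)]          # lowering matrix: entry (k-1, k) = sqrt(k)
    for k in range(1, n):
        lower[k - 1][k] = complex(math.sqrt(k))
    raise_ = [[lower[c][r].conjugate() for c in range(n)] for r in range(n)]
    s = 1 / math.sqrt(2)
    Q = [[s * (lower[r][c] + raise_[r][c]) for c in range(n)] for r in range(n)]
    P = [[1j * s * (raise_[r][c] - lower[r][c]) for c in range(n)] for r in range(n)]
    return Q, P

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]

N = 6
Q, P = oscillator_matrices(N)
C = commutator(Q, P)                  # QP - PQ
diagonal = [C[k][k] for k in range(N)]
off_diagonal = max(abs(C[r][c]) for r in range(N) for c in range(N) if r != c)

print(diagonal)
print(off_diagonal)
```

Every diagonal entry except the last equals i, as the relation requires, but the last entry is spoiled by the truncation. A finite matrix commutator always has zero trace, so the relation can hold exactly only for matrices of infinite order.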
The body of theory that was developed by Heisenberg, Born and Jordan came to be known as matrix mechanics. It was difficult but useful methodology. Pauli used it to compute the spectrum of hydrogen. People had only praise for matrix mechanics.
Then, seemingly out of nowhere came a superior methodology. Erwin Schrödinger was a well-respected physicist with somewhat of a specialization in optics. He was in his late thirties in contrast to Heisenberg and Jordan who were in their early to middle twenties. Schrödinger had not been involved in the development of quantum theory. What brought him into the field was the idea of Louis de Broglie that particles have a wave aspect. Schrödinger sought out de Broglie's work and read it avidly. It was exciting stuff for someone with an optical orientation. Schrödinger then wrote six articles that developed what he initially called undulatory mechanics, but which subsequently came to be known as wave mechanics. The mathematical basis for wave mechanics was partial differential equations, a field more familiar to physicists than matrix algebra.
Although most praised Schrödinger's wave mechanics Werner Heisenberg was not one of them. In a letter to a colleague he said
The more I think about the physical portion of the Schrödinger theory, the more repulsive I find it. What Schrödinger writes about the visualizability of his theory "is probably not quite right," in other words it's crap.

(Heisenberg's use of the term "probably not quite right" is an allusion to the gentle way Niels Bohr asserted the incorrectness of some statement in physics.)
Schrödinger subsequently wrote an article in which he showed that matrix mechanics and wave mechanics gave the same results in quantum analysis and were therefore equivalent. Later Heisenberg made use of wave mechanics in one of his articles because of the greater ease with which problems can be analyzed.
When Heisenberg was back in Copenhagen with Bohr he was pondering the matter of beta ray (electron) tracks in a cloud chamber. Heisenberg did not use any notion of electron orbits in atoms because he considered them unobservable. The cloud chamber tracks might indicate that he was wrong about the electron orbits not being observable. He went out for a walk in the park even though it was after midnight and quite dark. Cogitating on the fundamentals of physics he realized that it might not be possible, because of physical interactions, to specify simultaneously the location and velocity of a particle like an electron. Back in his room he used the example of a microscope to investigate the possibility of the simultaneous specification of location and velocity. It was then that he realized the limitation on the accuracy of such specifications. He called the concept the indeterminacy principle, but it subsequently became known as the uncertainty principle.
Heisenberg concluded that concepts like path, trajectory and orbit have no meaning at the quantum level. In late February of 1927 Heisenberg wrote a 14 page letter to Pauli describing his uncertainty principle and its basis. Pauli's reaction was favorable and Heisenberg turned the letter into an article in the early part of March. Bohr was away in Norway on a long vacation at the time. After Bohr returned he read Heisenberg's article. To Heisenberg's surprise Bohr disagreed with Heisenberg's assessment of the source of the uncertainty. Heisenberg thought the uncertainty stemmed from the discontinuities of particle collisions; Bohr thought it was from the dual nature of particle-waves. They ended their discussion still in disagreement. A few days later they talked again. Bohr did not want Heisenberg to publish his article until he had rewritten it. Heisenberg did not want to change anything and finally broke out in tears because of the pressure Bohr was putting on him. Bohr had formulated a concept while on his skiing vacation in Norway that he called complementarity. This was the notion that a particle and its wave were simply manifestations of some more fundamental entity.
Heisenberg sent away his article for publication near the end of March in 1927. It appeared in print at the end of May as a 27 page article. He soon received offers of professorships and he accepted the one from Leipzig University. He left Copenhagen in June of 1927. A copy of Heisenberg's article had been sent to Albert Einstein but Einstein did not respond.
It was not until the 1950's that mathematicians working with computers discovered the concept of a soliton. A soliton is a wave that maintains its shape even after collisions with other solitons. It thus is in the nature of a particle. In all likelihood particles are soliton warpings of space. At the subatomic level there cannot be anything other than warpings of space.
The Uncertainty Principle and Chaos Theory

Newtonian mechanics indicates that if the locations and velocities of the particles of a system are exactly known at any time then the future locations and velocities can be computed with any desired degree of precision. However Edward N. Lorenz discovered that the solutions to systems of nonlinear dynamic equations can be extremely sensitive to initial conditions. The deviations resulting from slight differences in initial conditions can grow at enormous rates such that after a period of time the deviations are substantial. Lorenz was concerned with meteorological prediction and the implication of his discovery is that meteorologists would not be able to make useful forecasts of the weather beyond ten days or so, but it applies to any dynamic system involving nonlinearity. In the case of meteorology the limit on the accuracy of the measurement of initial conditions is the technological problem of the accuracy of the measuring instruments. In the case of the dynamics of systems of particles the limit of accuracy is intrinsic due to the Uncertainty Principle.
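
Sensitivity to initial conditions is easy to demonstrate. The sketch below uses the logistic map x → 4x(1 − x) as a simple stand-in for a nonlinear dynamic system; it is not Lorenz's weather model, and the starting value and the size of the initial difference are arbitrary assumptions.

```python
def logistic_orbit(x0, steps):
    """Iterate the map x -> 4x(1 - x), a simple chaotic system, starting from x0."""
    orbit = [x0]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append(4 * x * (1 - x))
    return orbit

# Two starting points that differ by one part in a trillion (assumed values).
orbit_a = logistic_orbit(0.3, 100)
orbit_b = logistic_orbit(0.3 + 1e-12, 100)
gaps = [abs(a - b) for a, b in zip(orbit_a, orbit_b)]

for step in (0, 10, 20, 30, 40, 50):
    print(step, gaps[step])
print("largest gap over 100 steps:", max(gaps))
```

The gap roughly doubles at each step on average, so an initial error far below any practical measurement accuracy grows to the size of the whole range of the variable within a few dozen iterations.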
