The Explication and Criticism of the article by C.N. Yang and R.L. Mills of the Brookhaven National Laboratory at Upton, New York entitled "Conservation of Isotopic Spin and Isotopic Gauge Invariance"

San José State University

applet-magic.com Thayer Watkins Silicon Valley & Tornado Alley USA

The Yang-Mills article is one of the most important articles in particle physics. In it Yang and Mills formulate a principle of isotopic gauge invariance. This leads to a special field, which they call the b field, which bears the same relationship to isotopic spin that the electromagnetic field bears to electric charge. They find a set of nonlinear differential equations satisfied by the b field. The quanta of the b field are particles that have isotopic spin of 1 and a charge of 0 or ±e, where e is the magnitude of the charge of an electron.
The concept of isotopic spin was proposed in 1932 by Werner Heisenberg as a characteristic that distinguished protons and neutrons, which were otherwise regarded as the same particle. It was thought of as like a switch that could be turned from one position to another. In 1937 Eugene Wigner proposed isotopic spin as a quantity whose conservation explained the interactions and non-interactions involving protons and neutrons.
Yang and Mills rely upon what works for the case of an electromagnetic field as a guide for their formulation of a field associated with isotopic spin.
A particle has a complex-valued wavefunction ψ(X) that gives its probability density ψψ*(X) at every point X in space-time. If there is no electromagnetic field present and ψ is multiplied by exp(iα), the value of the probability density (exp(iα)ψ)(ψ*exp(−iα)) = ψψ* is unaffected and the physical observation of the particle is also unaffected. Multiplying ψ by exp(iα) is said to be a gauge transformation but it should be called a phase transformation. In the absence of an electromagnetic field the gauge α can be chosen arbitrarily and the laws for the dynamics of the particle must be such that they are unaffected by the choice of gauge. This imposes a strong requirement on the physical laws because the gauge can be chosen arbitrarily at every different point in space-time. This is the significance of gauge invariance: it imposes a severe limitation on what form physical laws can take. It is not that anyone would actually want to choose a different gauge at every point in space at every moment of time.
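The invariance of the probability density under a point-by-point change of phase can be checked numerically. The following is only a minimal sketch and is not taken from the Yang-Mills article; the sample values of ψ and α are random numbers chosen for illustration.

```python
import numpy as np

# Illustrative assumption: psi is sampled at five space-time points,
# and a different phase alpha is chosen independently at each point.
rng = np.random.default_rng(0)
psi = rng.normal(size=5) + 1j * rng.normal(size=5)
alpha = rng.uniform(0.0, 2.0 * np.pi, size=5)

# Phase (gauge) transformation psi -> exp(i*alpha) * psi
psi_prime = np.exp(1j * alpha) * psi

# Probability density psi psi* before and after the transformation
density = (psi * psi.conj()).real
density_prime = (psi_prime * psi_prime.conj()).real

print(np.allclose(density, density_prime))
```

The density is unchanged even though the phase differs from point to point; it is the derivatives of ψ, not the density, that are sensitive to a varying α.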
The experimental background for isotopic spin is given by Yang and Mills as follows:
Then in 1937 Breit, Condon and Present pointed out the approximate equality of p-p and n-p interactions in the 1S state. It seemed natural to assume this equality holds also in the other states available to both the n-p and p-p systems. Under such an assumption one arrives at the concept of a total isotopic spin which is conserved in nucleon-nucleon interactions.

Here is an editorial note. By relying upon scattering experiments, which involve the interactions only of particle pairs, the physicists limited their investigation to the interactions and forces involving spin pair formation. Such interactions involve a fixed binding energy of about 3 million electron volts (MeV). What was left out was the true distance-dependent strong force, which is of lesser magnitude and involves repulsion between nucleons of the same type and attraction between nucleons of different types. This strong force becomes dominant in nuclei involving larger numbers of nucleons and shows up in the levels of binding energy and the limits to the differences in the numbers of each type of nucleon. That is to say, the need for a degree of balance between the numbers of neutrons and protons arises because neutrons are attracted only to protons, and the repulsion between neutrons and neutrons and between protons and protons can only be overcome by such a balance. The repulsion between protons is stronger than that between neutrons and that is why there is a relative abundance of neutrons in heavier nuclei. Now back to the analysis of Yang and Mills.
Yang and Mills note that if an electromagnetic field is present it is necessary to counteract the variation of the gauge α by requiring that the electromagnetic field A_μ change under a gauge transformation as
A'_μ = A_μ + (1/e)∂α/∂x_μ

They define an isotopic gauge transformation as a rotation of the isotopic spin coordinates at each point in space-time. The wave function ψ describing a field with isotopic spin ½ has two components. A rotation is expressed in terms of a 2×2 unitary matrix S with a determinant of 1. The transformation is
ψ' = S^-1ψ
or, equivalently
ψ = Sψ'

By using matrices for the transformations, the analysis is set in terms of non-abelian groups, because for square matrices S₁ and S₂ the product S₁S₂ is not necessarily equal to S₂S₁. This non-abelian character is not mentioned in the article but it is what the article is most remembered for. Later Yang and Mills limit the transformations to infinitesimal rotations.
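The failure of such matrices to commute is easy to exhibit. The sketch below builds two 2×2 unitary matrices of determinant 1 and compares the two orders of multiplication. The helper `rotation` and the particular axes and angles are illustrative assumptions, not anything specified in the article.

```python
import numpy as np

def rotation(axis, angle):
    """Hypothetical helper: 2x2 unitary matrix with determinant 1
    for a rotation by `angle` about the unit vector along `axis`."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = n[0] * sx + n[1] * sy + n[2] * sz
    return np.cos(angle / 2) * np.eye(2) + 1j * np.sin(angle / 2) * n_sigma

S1 = rotation([1, 0, 0], 0.7)   # rotation about the first axis
S2 = rotation([0, 0, 1], 1.1)   # rotation about the third axis

# A nonzero commutator means the order of the rotations matters.
commutator = S1 @ S2 - S2 @ S1
print(np.linalg.det(S1), np.linalg.det(S2))
print(np.linalg.norm(commutator))
```

Rotations about the same axis would commute; it is only rotations about different axes that do not.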
At this point it is important to note that Yang and Mills use the convention that the physical units are such that the speed of light in a vacuum is equal to 1 and that Planck's constant divided by 2π is equal to 1. Furthermore the fourth coordinate of a point in space-time is time multiplied by the imaginary unit i; i.e. x₄=it.
Yang and Mills define a B field in terms of four 2×2 matrices B_μ which for μ=1,2,3 are Hermitian and for μ=4 is anti-Hermitian. They then assume, in analogy with the case of an electromagnetic field, that all derivatives of the wave function ψ appear in the form
(∂_μ−iεB_μ)ψ

Invariance then requires that
S(∂_μ−iεB'_μ)ψ' = (∂_μ−iεB_μ)ψ

This means that the B field components must transform as
B'_μ = S^-1B_μS + (i/ε)S^-1(∂S/∂x_μ)

The second term on the right is in the nature of a gradient.
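The transformation rule for B_μ can be checked numerically in a reduced setting. The sketch below assumes a single real coordinate x and arbitrarily chosen smooth functions S(x), B(x) and ψ'(x), none of which come from the article. It verifies that S(∂−iεB')ψ' equals (∂−iεB)ψ when ψ = Sψ' and B' is given by the rule above.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
EPS = 0.8  # illustrative value of the coupling parameter epsilon

def S(x):
    # Assumed gauge rotation: angle 0.9*x about a fixed axis
    theta = 0.9 * x
    return np.cos(theta / 2) * I2 + 1j * np.sin(theta / 2) * SX

def dS(x):
    # Analytic derivative of S with respect to x
    theta = 0.9 * x
    return 0.45 * (-np.sin(theta / 2) * I2 + 1j * np.cos(theta / 2) * SX)

def B(x):
    # Arbitrary smooth Hermitian 2x2 matrix, chosen only for illustration
    return np.array([[np.sin(x), x - 1j * x**2],
                     [x + 1j * x**2, np.cos(x)]])

def B_prime(x):
    # B' = S^-1 B S + (i/eps) S^-1 dS/dx, with S^-1 = S^dagger for unitary S
    S_inv = S(x).conj().T
    return S_inv @ B(x) @ S(x) + (1j / EPS) * S_inv @ dS(x)

def psi_prime(x):
    # Arbitrary two-component wave function in the primed gauge
    return np.array([np.exp(1j * x), x**2 + 0.5j])

def psi(x):
    return S(x) @ psi_prime(x)

def ddx(f, x, h=1e-5):
    # Central finite difference
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 0.3
lhs = S(x0) @ (ddx(psi_prime, x0) - 1j * EPS * B_prime(x0) @ psi_prime(x0))
rhs = ddx(psi, x0) - 1j * EPS * B(x0) @ psi(x0)
print(np.allclose(lhs, rhs, atol=1e-6))
```

Dropping the gradient-like second term from `B_prime` makes the check fail, which is one way to see why that term is needed.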
Yang and Mills then define a tensor F_μν as
F_μν = (∂B_μ/∂x_ν)−(∂B_ν/∂x_μ) + iε(B_μB_ν− B_νB_μ)

The reason for defining this tensor is that it transforms according to
F'_μν = S^-1F_μνS
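This transformation property can be tested numerically. The sketch below takes F_μν = (∂B_μ/∂x_ν)−(∂B_ν/∂x_μ) + iε(B_μB_ν−B_νB_μ) and B'_μ = S^-1B_μS + (i/ε)S^-1(∂S/∂x_μ), and works under simplifying assumptions that are not in the article: only two real coordinates, Hermitian B_μ for both, and arbitrarily chosen smooth functions for S and B_μ.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
EPS = 0.8                       # illustrative coupling parameter
A = np.array([0.9, -0.4])       # assumed gradients of the two rotation angles
C = np.array([0.3, 0.7])

def rot(sigma, theta):
    return np.cos(theta / 2) * I2 + 1j * np.sin(theta / 2) * sigma

def drot(sigma, theta):
    # Derivative of rot with respect to theta
    return 0.5 * (-np.sin(theta / 2) * I2 + 1j * np.cos(theta / 2) * sigma)

def S(x):
    # Product of rotations about two different axes, varying with position
    return rot(SX, A @ x) @ rot(SZ, C @ x)

def dS(x, mu):
    # Analytic partial derivative of S with respect to x_mu (product rule)
    return (A[mu] * drot(SX, A @ x) @ rot(SZ, C @ x)
            + C[mu] * rot(SX, A @ x) @ drot(SZ, C @ x))

def B(x, mu):
    # Arbitrary smooth Hermitian matrices, chosen only for illustration
    u, v = x
    if mu == 0:
        return np.array([[np.sin(v), u * v - 1j * v],
                         [u * v + 1j * v, np.cos(u)]])
    return np.array([[u**2, np.sin(u + v) + 1j * u],
                     [np.sin(u + v) - 1j * u, v]])

def B_prime(x, mu):
    S_inv = S(x).conj().T       # inverse of a unitary matrix
    return S_inv @ B(x, mu) @ S(x) + (1j / EPS) * S_inv @ dS(x, mu)

def partial(field, x, mu, nu, h=1e-5):
    # Central difference of field(x, mu) with respect to x_nu
    step = np.zeros(2)
    step[nu] = h
    return (field(x + step, mu) - field(x - step, mu)) / (2 * h)

def field_strength(field, x, mu, nu):
    Bm, Bn = field(x, mu), field(x, nu)
    return (partial(field, x, mu, nu) - partial(field, x, nu, mu)
            + 1j * EPS * (Bm @ Bn - Bn @ Bm))

x0 = np.array([0.3, -0.2])
F01 = field_strength(B, x0, 0, 1)
F01_prime = field_strength(B_prime, x0, 0, 1)
expected = S(x0).conj().T @ F01 @ S(x0)
print(np.allclose(F01_prime, expected, atol=1e-6))
```

The commutator term is essential here: without it the derivative terms alone do not transform in this simple way when the matrices fail to commute.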

For further analysis Yang and Mills conjure up a set of three matrices Tⁱ, for i=1,2,3 which they call isotopic spin angular momentum matrices. They then hypothesize that the B field is a linear combination of these T matrices; i.e.,
B_μ = 2b_μ·T

where each b_μ is stated to be a three component vector. Since each B_μ was previously defined to be a 2×2 matrix it appears that the above equation is impossible. There is no way a 3×1 or a 1×3 matrix can generate a 2×2 matrix no matter what the dimensions of T are.
There is effectively no connection between the b field and the preceding B field since the T matrices are undefined. In effect, the analysis really begins here with the hypothesis of the existence of four three-component vectors b_μ.
The b_μ's are used to define a covariant tensor
f_μν = (∂b_μ/∂x_ν)−(∂b_ν/∂x_μ) − 2εb_μ×b_ν

The Lagrangian Analysis

The heart of the analysis is the formulation of a Lagrangian density function and the derivation of the dynamics which it implies. Whatever shortcomings the article has outside of this matter are irrelevant if this part of the analysis is done properly.
Yang and Mills hypothesize a set of four vectors, b_μ, the first three of which have real components and the fourth has pure imaginary components. From the b vectors a covariant tensor f_μν is defined as a matrix
f_μν = (∂b_μ/∂x_ν) − (∂b_ν/∂x_μ) − 2εb_μ×b_ν

where ε is a parameter to be determined empirically. There is no justification given for this definition, particularly the vector product of the b vectors.
Yang and Mills say that the Lagrangian density function that would be justified in analogy with the electromagnetic field is −¼f_μν·f_μν but that they will use the following
L = −¼f_μν·f_μν − ψ̄γ_μ(∂_μ−iετ·b_μ)ψ − mψ̄ψ

where ψ̄ denotes the transpose of the conjugate of the two-component wave function ψ(x). There is no explanation for the vector τ which appears in the definition, nor for the terms γ_μ, which might be the Pauli spin matrices. They say that the additional terms in this Lagrangian density function represent a field with isotopic spin of ½.
Yang and Mills then give two equations of motion, which presumably are the Euler-Lagrange equations associated with the Lagrangian density function.
(∂f_μν/∂x_ν) + 2ε(b_ν×f_μν) + iεψ̄γ_μτψ = 0

γ_μ(∂_μ−iετ·b_μ)ψ + mψ = 0

Presumably the first equation is the Euler-Lagrange equation for the space-time variables x_μ but the source and nature of the second and third terms is a mystery. It is not as though this is obvious. The source of the second equation is left as a mystery as well. It appears to involve the derivative of the Lagrangian density function with respect to ψ but why that would be appropriate is also a mystery.
Yang and Mills define the third term of the first equation as an isotopic spin current J_μ; i.e.,
iεψ̄γ_μτψ = J_μ

They say that
(∂J_μ/∂x_μ) = − 2εb_μ×J_μ

By defining a total current
𝒥_μ = J_μ + 2εb_ν×f_μν

they get an equation of continuity
(∂𝒥_μ/∂x_μ) = 0

which implies a conservation of total isotopic spin. This is all plausible but is dependent upon an equation produced without derivation from a purely hypothesized Lagrangian density function.
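The step from the first equation of motion to the continuity equation rests on the antisymmetry of f_μν in its two indices: taking ∂/∂x_μ of that equation, the term ∂²f_μν/∂x_μ∂x_ν vanishes, leaving the divergence of the combined current J_μ + 2εb_ν×f_μν equal to zero. The sketch below illustrates only that vanishing, for a single plane-wave component f_μν = a_μν exp(ik·x), under the assumption of a random wave vector k and random antisymmetric amplitudes a_μν (neither taken from the article).

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical wave vector k_mu for one plane-wave component
k = rng.normal(size=4)

# Arbitrary amplitudes: two space-time indices and three isotopic components
g = rng.normal(size=(4, 4, 3))

# Antisymmetrize in (mu, nu), as f_mu_nu is antisymmetric
a = g - g.transpose(1, 0, 2)

# For f = a * exp(i k.x), the double divergence is -k_mu k_nu a_mu_nu
# times the plane-wave factor; compute the coefficient for each component.
double_divergence = -np.einsum('m,n,mnc->c', k, k, a)
print(double_divergence)
```

The contraction of the symmetric product k_μk_ν with an antisymmetric a_μν is zero for every isotopic component, whatever values are drawn.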
Yang and Mills then go on to impose a remarkably strong condition which they label as merely a supplemental condition. That condition is
∂b_μ/∂x_μ = 0

This condition is that the divergence of the b field is zero. It seems a little late in the analysis to impose such a strong condition.
Quantization

Yang and Mills consider the matter of quantization and present some very complex equations but these equations do not seem to imply anything of empirical significance.
Properties of the b Quanta

In the final section of the article Yang and Mills present an empirically significant proposition.
The quanta of the b field clearly have spin unity and isotopic spin unity. We know their electric charge too because all the interactions that we propose must satisfy the law of the conservation of electric charge, which is exact. The two states of the nucleon, namely proton and neutron, differ by charge unity. Since they can transform into each other through the emission of a b quantum, the latter must have three charge states with charges of ±e and 0.

It is hard to see how this proposition is derived from their analysis. It seems more like an empirical fact tacked onto the analysis.
Yang and Mills then go on to the matter of mass. They say,
We next come to the question of the mass of the b quantum, to which we do not have a satisfactory answer. One may argue that without a nucleon field the Lagrangian would contain no quantity of the dimension of a mass, and that therefore the mass of the b quantum in such a case is zero. This argument is however subject to the criticism that, like all field theories, the b field is beset with divergences, and dimensional arguments are not satisfactory.

Editorial Comment

It is difficult to believe that those who have had praise for the Yang-Mills article have ever read it in detail. It seems that the profession liked the goal and the spirit of it and assumed without question that the analysis was valid and presented correctly.
(To be continued.)
