San José State University
Thayer Watkins
Silicon Valley
& Tornado Alley

An Introduction to Tensor Analysis


This is an introduction to the concepts and procedures of tensor analysis. It makes use of the more familiar methods and notation of matrices to make this introduction. First it is worthwhile to review the concept of a vector space and the space of linear functionals on a vector space.

A vector space is a set of elements V and a number of associated operations. There is an addition operation defined such that for any two elements u and v in V there is an element w=u+v. Furthermore there is an element of V, call it the zero vector 0, such that for any element u of V, u+0=u. And for any element u of V there is an element of V, say v, such that u+v=0. The element v is called the additive inverse of u and is denoted as −u. There is a set of scalars Λ for a vector space such that for any element v of V and any element λ of Λ there is an element of V, denoted as λv and called the scalar product of λ and v. Usually the set of scalars is the real numbers or the complex numbers.

For a vector space there is a set of elements of V, called a basis, such that any vector in V can be represented as a linear combination of the basis elements. The linear combination is given by the scalar coefficients for the basis elements. Thus any vector in V can be represented by an ordered array of scalar elements, which are called the components of the vector. Such an ordered array can be considered a column vector of the scalar elements. The basis of a vector space may or may not be finite.

Linear functionals on the vector space are functions which map the elements of V into the set of scalars Λ. Addition and scalar multiplication on the set of linear functionals are defined and likewise a zero element and additive inverses. Thus the linear functionals over the set of vectors V form a vector space, called the dual space to the vector space for V.

If the vector space for V is of finite dimension n then so is its dual space and thus the dual space has the same structure as that for V. If the vector space for V is not of finite dimension then the dual space is not necessarily of the same structure.

For the vector space of n-dimensional column vectors the dual space may be thought of as also consisting of n-dimensional vectors. Then the linear functionals may be considered as matrix products; i.e., if X is an element of V and F is an element of the dual space then the value of the linear functional is the matrix product of the transpose of F with X; i.e.,

$z = F^T X$

Now consider a change in the coordinate system for V such that the vector of components X gets changed into the vector of components Y by multiplication by an invertible matrix M; i.e.,

Y = MX

The linear functional $z = F^T X$ gets transformed to $z = G^T Y$. Since $Y = MX$ and $X = M^{-1}Y$, therefore

$z = F^T M^{-1} Y$
and hence
$G^T = F^T M^{-1}$
and therefore
$G = (M^{-1})^T F = (M^T)^{-1} F$

The dual vectors get transformed by the inverse of the transpose of the transformation of the vectors. Thus there are some vectors which get transformed by one rule and other vectors which get transformed by an associated alternate rule. A similar arrangement occurs in tensor analysis in which some tensors are called covariant and transform according to one rule and others are called contravariant and transform according to an alternate but related rule.
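The invariance of the functional under the paired rules can be checked numerically. The following is only a minimal sketch in plain Python; the 2×2 matrix M and the vectors X and F are arbitrary values chosen for illustration and the helper names are not from the text.

```python
# Check that z = F^T X is unchanged when X -> MX and F -> (M^T)^{-1} F.
# M, X and F are arbitrary illustrative values, not taken from the text.

def mat_vec(a, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

M = [[2.0, 1.0], [0.0, 3.0]]   # invertible, deliberately not symmetric
X = [1.0, -2.0]                # components of a vector
F = [4.0, 0.5]                 # components of a dual vector

Y = mat_vec(M, X)                          # vector rule: Y = MX
G = mat_vec(inverse_2x2(transpose(M)), F)  # dual rule: G = (M^T)^{-1} F

z_old = dot(F, X)   # functional evaluated in the old coordinates
z_new = dot(G, Y)   # functional evaluated in the new coordinates
```

Both z_old and z_new come out equal (up to rounding), which is the point of pairing the two rules.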

It is very tempting to identify the vectors of the dual space as row vectors. This would illustrate how the dual space could have the same mathematical structure as the primal (original) space yet be distinct. However there is no provision in tensor analysis for identifying one vector as a column vector and another one as a row vector.

The Nature of a Tensor

A tensor is an entity which is represented in any coordinate system by an array of numbers called its components. The components change from coordinate system to coordinate system in a systematic way described by rules. The arrays of numbers are not the tensor; they are only the representation of the tensor in a particular coordinate system. A vector is a special case of a tensor. A vector is an entity which has direction and magnitude and is represented by a one dimensional array of numbers. Unfortunately it is common to consider any one dimensional array of numbers as a vector.

A more general notion of a vector involves not only its direction and magnitude but also its point of application. Thus a vector can be represented by two points; one point representing its point of application and the line between the points giving its direction and magnitude. The points in n-dimensional space can be considered invariant but their representations in a coordinate system are given by two n-dimensional arrays of numbers, say X1 and X2. If the coordinate system is changed by shifting the origin (translation), rotating the axes and/or stretching the axes (dilation) then the components of the representation of the points change. The translation of the origin is given by an n-dimensional array, say A, and the rotation and dilation by a matrix, say B. The new representation of a point P is Y where

Y = A + BX

where X is the representation in the old coordinate system.

The Allowable Transformations of Coordinate Systems

The allowable transformations are the ones which have inverses; i.e.,

if yi = fi(x1, x2, … , xn) for i=1,2,…,n
then there exists a set of functions {gi: i=1,…,n} such that
xi = gi(y1, y2, … , yn) for i=1,2,…,n

The conditions for the existence of an inverse transformation can be stated in terms of the Jacobian determinant for the transformation. Let Ji,j=∂yi/∂xj and let J be the matrix formed from the Jij's. The condition for the existence of an inverse transformation within some region R is that det(J)≠0 within that region.

If the x-coordinates are transformed to y-coordinates and the y-coordinates are then transformed to z-coordinates then the matrix for the transformation of the x-coordinates to the z-coordinates is just the product of the matrices for the two intermediate transformations; i.e., if J is the matrix for the transformation from the x's to the y's and K is the matrix of the transformation from the y's to the z's then, by the chain rule ∂zi/∂xj = Σk(∂zi/∂yk)(∂yk/∂xj), the matrix of the transformation from the x's to the z's is KJ. Thus the Jacobian for the transformation from the x's to the z's is det(K)det(J) and hence if det(J) and det(K) are both nonzero in some region R then so is det(KJ). Thus if both of the intermediate transformations are allowable then so is their composition.

There is the trivial transformation yi=xi, called the identity transformation. The Jacobian of this transformation is equal to unity.

Since the composition of a transformation with its inverse is just the identity transformation it follows that the product of the Jacobian of a transformation with the Jacobian of the inverse transformation is equal to unity and hence $\det(J^{-1}) = 1/\det(J)$.

All of this adds up to the mathematically interesting proposition that the set of allowable transformations of coordinate systems forms a group.

Illustration of a Transformation and its Jacobian Matrix

The transformation from cylindrical coordinates to Euclidean coordinates will be used for the illustration. First these coordinates will be given in their more familiar form and then converted to the subscripted notation. The cylindrical coordinates are (r, θ, z) and the Euclidean coordinates are (x, y, z) where

x = rcos(θ)
y = rsin(θ)
z = z

In subscripted notation then

x1 = r
x2 = θ
x3 = z
y1 = x
y2 = y
y3 = z

Thus the transformation is

y1 = x1cos(x2)
y2 = x1sin(x2)
y3 = x3

Therefore the elements of the Jacobian matrix Ji,j = ∂yi/∂xj are

J1,1 = cos(x2)
J1,2 = −x1sin(x2)
J1,3 = 0
J2,1 = sin(x2)
J2,2 = x1cos(x2)
J2,3 = 0
J3,1 = 0
J3,2 = 0
J3,3 = 1


The Jacobian determinant reduces to x1[cos²(x2) + sin²(x2)] and hence det(J) = x1.

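In the more familiar notation the Jacobian matrix is

\[
J = \begin{pmatrix} \cos\theta & -r\sin\theta & 0 \\ \sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

and the Jacobian of the transformation is r.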

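The Jacobian matrix of the inverse transformation, the inverse of the above matrix, is then

\[
J^{-1} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta/r & \cos\theta/r & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

Its determinant is 1/r, which of course is the reciprocal of the determinant of the Jacobian matrix of the original transformation.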
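The two determinants can also be checked numerically. The sketch below estimates both Jacobian matrices by central differences; the function names, the step size h and the sample point are assumptions made for this illustration only.

```python
import math

def cyl_to_cart(r, theta, z):
    """Cylindrical (r, theta, z) to Euclidean (x, y, z)."""
    return (r * math.cos(theta), r * math.sin(theta), z)

def cart_to_cyl(x, y, z):
    """Inverse transformation, valid away from the axis r = 0."""
    return (math.hypot(x, y), math.atan2(y, x), z)

def jacobian(func, point, h=1e-6):
    """Central-difference estimate of J[i][j] = d(out_i)/d(in_j) at `point`."""
    n = len(point)
    cols = []
    for j in range(n):
        up = list(point); up[j] += h
        dn = list(point); dn[j] -= h
        f_up, f_dn = func(*up), func(*dn)
        cols.append([(a - b) / (2 * h) for a, b in zip(f_up, f_dn)])
    # cols[j][i] holds d(out_i)/d(in_j); transpose so rows index the outputs
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

r, theta, z = 2.0, 0.7, 1.5            # arbitrary sample point with r > 0
det_J = det3(jacobian(cyl_to_cart, (r, theta, z)))
det_J_inv = det3(jacobian(cart_to_cyl, cyl_to_cart(r, theta, z)))
```

At this point det_J should be close to r and det_J_inv close to 1/r, so their product is close to unity.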


Let X be a representation of a vector in n-dimensional space and let h=f(X) be a scalar function of X. If the X representation is transformed to another coordinate system by the transformation T(), such that Y=T(X), where T is a vector-valued function, then the scalar function f(X) gets transformed into g(Y) where $g(Y) = f(T^{-1}(Y))$. For simplicity let the inverse transformation $T^{-1}(Y)$ be represented as S(Y). Thus g(Y)=f(S(Y)).

Consider the vector whose components are {∂f/∂xi, i=1,…,n}. This is called the gradient of the scalar function f. Now consider the gradient of the scalar function g(Y); i.e., {∂g/∂yj, j=1,…,n}. How are the components of the gradient of g related to the components of the gradient of f? Calculus gives the answer as

∂g/∂yj = Σi∂f/∂xi(∂xi/∂yj)

Summations arise almost always when there is a repeated index such as i in the above equation. Albert Einstein promoted the convention that since a repeated index almost always means summation over that index the summation sign Σ can be dispensed with. Therefore in tensor notation the above equation is

∂g/∂yj = ∂f/∂xi(∂xi/∂yj)
or, in better form
∂g/∂yj = (∂xi/∂yj)(∂f/∂xi)
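The chain-rule formula can be tried on a concrete case. In the sketch below the inverse transformation S (a polar-type map in two dimensions) and the scalar function f are hypothetical choices made only for illustration; the gradient of g computed from the formula is compared with a finite-difference estimate.

```python
import math

def S(y1, y2):
    """Assumed inverse transformation x = S(y): x1 = y1 cos(y2), x2 = y1 sin(y2)."""
    return (y1 * math.cos(y2), y1 * math.sin(y2))

def f(x1, x2):
    """Illustrative scalar function of the x coordinates."""
    return x1 ** 2 + 3.0 * x2

def grad_f(x1, x2):
    """Analytic gradient (df/dx1, df/dx2) of the function above."""
    return (2.0 * x1, 3.0)

def g(y1, y2):
    """The same scalar expressed in the y coordinates: g(Y) = f(S(Y))."""
    return f(*S(y1, y2))

def dx_dy(y1, y2):
    """Matrix whose [i][j] entry is d x_i / d y_j for the map S."""
    return [[math.cos(y2), -y1 * math.sin(y2)],
            [math.sin(y2),  y1 * math.cos(y2)]]

y = (1.5, 0.4)          # arbitrary sample point
x = S(*y)
m = dx_dy(*y)
df = grad_f(*x)

# dg/dy_j = sum over i of (dx_i/dy_j)(df/dx_i): the sum runs over the first index of m
grad_g_chain = [sum(m[i][j] * df[i] for i in range(2)) for j in range(2)]

h = 1e-6
grad_g_numeric = [
    (g(y[0] + h, y[1]) - g(y[0] - h, y[1])) / (2 * h),
    (g(y[0], y[1] + h) - g(y[0], y[1] - h)) / (2 * h),
]
```

The two lists agree to within the accuracy of the finite differences.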

Let (∂xi/∂yj) be denoted as the i,j element of a matrix M and let (∂f/∂xi) and (∂g/∂yj) be the components of the vectors ∇xf and ∇yg, respectively. Because the summation is over the first index of M, in matrix notation the above transformation of the gradient vector is represented as

$\nabla_y g = M^T \nabla_x f$

Any entity whose components transform in this way is called a covariant tensor of rank 1. There are other entities which transform in a similar but different way. Consider the differentials dyj and dxi for i and j ranging from 1 to n. From calculus

dyj = Σi(∂yj/∂xi)dxi
or, in tensor notation
dyj = (∂yj/∂xi)dxi

The quantities (∂yj/∂xi) are the elements of what previously had been referred to as the Jacobian matrix of the transformation. Denoting the Jacobian matrix as J then the transformation of the vector of infinitesimals dX to dY is represented as

dY = JdX

This transformation is called contravariant and a vector transforming in this way would be called a contravariant tensor of rank 1.

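What was denoted above as the matrix M is the Jacobian matrix of the inverse transformation; its i,j element is (∂xi/∂yj). Since the summation in the gradient rule runs over the first index of this matrix, the covariant transformation of the gradient vectors expressed previously can be represented as

$\nabla_y g = (J^{-1})^T \nabla_x f$

This is the same rule, the inverse of the transpose, found in the introduction for the vectors of the dual space.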

These two rules for transformation, covariant and contravariant, are the defining characteristics of tensors. They can be extended to multiple indices. The indices for covariance are conventionally denoted as subscripts and those for contravariance as superscripts.

The Metric Tensor

One scalar quantity of importance in geometry is distance. In Euclidean coordinates for an n-dimensional space the formula for the length ds of an infinitesimal line segment is

$(ds)^2 = \sum_i (dx_i)^2$
or, using the summation convention,
$(ds)^2 = dx_i\,dx_i$

If dX represents the column vector of dxi's then the formula for the line length could be expressed as

$(ds)^2 = (dX)^T (dX)$

If the coordinate system is changed from the Euclidean X's to some coordinate system of Y's then

dxi = (∂xi/∂yj)dyj

In matrix notation this may be expressed as

dX = MdY

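where M is the matrix whose mi,j element is (∂xi/∂yj). Thus $(dX)^T = (dY)^T M^T$ and hence

$(ds)^2 = (dY)^T M^T M\,dY = (dY)^T G\,dY$

where the matrix G is $M^T M$. The elements of G are given by

$g_{i,k} = (\partial x_j/\partial y_i)(\partial x_j/\partial y_k)$

with summation over the repeated index j.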

The two dimensional array of the gi,k's is called the metric tensor. That it is in fact a tensor and a covariant one at that is something that needs to be proven. It is the metric tensor for the coordinate system Y. The metric tensor for the Euclidean coordinate system is such that gi,k = δi,k, where δi,k = 0 if i≠k and δi,k = 1 if i=k.

For cylindrical coordinates the transformation from cylindrical to Euclidean coordinates, which gives the x's in terms of the y's, is in familiar notation

x = rcos(θ)
y = rsin(θ)
z = z

Thus the matrix M is given by

m1,1 = ∂x/∂r = cos(θ), m1,2 = ∂x/∂θ = −rsin(θ), m1,3 = ∂x/∂z = 0,
m2,1 = ∂y/∂r = sin(θ), m2,2 = ∂y/∂θ = rcos(θ), m2,3 = ∂y/∂z = 0,
m3,1 = ∂z/∂r = 0, m3,2 = ∂z/∂θ = 0, m3,3 = ∂z/∂z = 1

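Thus the matrix M is

\[
M = \begin{pmatrix} \cos\theta & -r\sin\theta & 0 \\ \sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

and its transpose is

\[
M^T = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -r\sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

Thus the metric matrix $G = M^T M$ is

\[
G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

The length of an infinitesimal line element is then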

(ds)² = (dr)² + (rdθ)² + (dz)²
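The product of the transpose of M with M is easy to verify by machine. The following sketch builds M for cylindrical coordinates at an arbitrarily chosen point and forms G element by element; the function names are illustrative only.

```python
import math

def M_cyl(r, theta):
    """Matrix of partial derivatives of (x, y, z) with respect to (r, theta, z)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s, 0.0],
            [s,  r * c, 0.0],
            [0.0, 0.0, 1.0]]

def metric(m):
    """G = M^T M, i.e. g[i][k] = sum over j of m[j][i] * m[j][k]."""
    n = len(m)
    return [[sum(m[j][i] * m[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

r, theta = 2.0, 0.7                     # arbitrary sample point
G = metric(M_cyl(r, theta))
expected = [[1.0, 0.0, 0.0],
            [0.0, r * r, 0.0],
            [0.0, 0.0, 1.0]]

# largest deviation of G from diag(1, r^2, 1)
max_error = max(abs(G[i][k] - expected[i][k]) for i in range(3) for k in range(3))
```

The off-diagonal entries cancel and the middle diagonal entry is r², which is what gives the (rdθ)² term in the line element.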

The inverse of the metric tensor is significant. In the case of the cylindrical coordinate system this inverse $G^{-1}$ is

\[
G^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/r^2 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]
In tensor analysis the metric tensor is denoted as $g_{ij}$ and its inverse is denoted as $g^{ij}$. This latter notation suggests that the inverse has something to do with contravariance. To see the connection, consider a linear change of coordinates Y=MX in which, reversing the roles used above, the Y system is the Euclidean one. Then $(ds)^2 = (dY)^T(dY) = (dX)^T M^T M\,dX$, so that $G = M^T M$ is the metric matrix of the X system. Now consider $G^{-1}X$. Since $G = M^T M$,

$G^{-1} = M^{-1}(M^T)^{-1} = M^{-1}(M^{-1})^T$

Thus the transformation of $G^{-1}X$ would be given by

$Y = M G^{-1} X = M (M^{-1} (M^{-1})^T) X = (M^{-1})^T X$
but this is equivalent to
$Y = (M^T)^{-1} X$

This is the transformation rule for a covariant tensor; it is the rule for what in the Introduction was referred to as a vector of the dual space. In other words, applying the contravariant rule to $G^{-1}X$ gives the same result as applying the covariant rule to X. Thus multiplication of a covariant tensor by the contravariant metric tensor creates a contravariant tensor. The operation also works in the other direction. Multiplication of a contravariant tensor by the metric tensor produces a covariant tensor.

Suppose the transformation of the coordinate system is such that Y=MX. Then the linear functional H in the new coordinate system corresponding to F in the original coordinate system is given by

$H = (M^T)^{-1} F$

Now consider GF where $G = M^T M$. Then

$H = (M^T)^{-1} G F = (M^T)^{-1} (M^T M) F$
which reduces to
$H = MF$

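This is the transformation rule for a contravariant tensor. In other words, applying the covariant rule to GF gives the same result as applying the contravariant rule to F. Thus multiplication of a contravariant tensor by the metric tensor produces a covariant tensor.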

The Christoffel Symbols

The Christoffel 3-index symbol of the first kind is defined as

$[ij,k] = \tfrac{1}{2}\left[\partial g_{ik}/\partial x_j + \partial g_{jk}/\partial x_i - \partial g_{ij}/\partial x_k\right]$

A property of these symbols obvious from the definition is that

[ij,k] = [ji,k]

When the Christoffel symbols of the first kind are multiplied by the elements of the inverse of the metric tensor and the results summed over the third index a different set of symbols is generated, called the Christoffel 3-index symbols of the second kind. There are different notations used for these symbols, but the typographically most convenient is, in analogy with the symbols of the first kind, {ij,k}. I. S. Sokolnikoff in his Tensor Analysis uses a two level symbol impossible to create in HTML. The Princeton school uses a symbol of the form $\Gamma^k_{ij}$ but the superscripts and subscripts cannot be properly lined up using HTML. Therefore {ij,k} will be used here. The definition is

$\{ij,k\} = g^{kp}[ij,p]$
where, of course, there is a summation over the index p.

It follows from the symmetry of [ij,k] with respect to the interchange of the first two indices that

{ij,k} = {ji,k}

The Christoffel symbols are essential in defining the derivative of tensors. However the Christoffel symbols themselves do not represent a tensor. That is to say they do not transform upon a change in coordinate system according to either the covariant or contravariant rules of transformation.
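As an illustration, the symbols can be computed for the cylindrical metric diag(1, r², 1). The sketch below uses the standard definition [ij,k] = ½(∂gik/∂xj + ∂gjk/∂xi − ∂gij/∂xk), estimates the derivatives of the metric by central differences, and inverts the metric entry by entry, which is valid only because this metric is diagonal. Indices are counted from 0 (0 for r, 1 for θ, 2 for z) and all function names are assumptions of this example.

```python
def metric_cyl(x):
    """Metric matrix for cylindrical coordinates x = (r, theta, z)."""
    r = x[0]
    return [[1.0, 0.0, 0.0], [0.0, r * r, 0.0], [0.0, 0.0, 1.0]]

def metric_derivs(metric, x, h=1e-6):
    """dg[a][i][j] = d g_ij / d x_a, estimated by central differences."""
    n = len(x)
    dg = []
    for a in range(n):
        up = list(x); up[a] += h
        dn = list(x); dn[a] -= h
        g_up, g_dn = metric(up), metric(dn)
        dg.append([[(g_up[i][j] - g_dn[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    return dg

def first_kind(dg, i, j, k):
    """[ij,k] = (d g_ik/d x_j + d g_jk/d x_i - d g_ij/d x_k) / 2."""
    return 0.5 * (dg[j][i][k] + dg[i][j][k] - dg[k][i][j])

def second_kind(metric, x, i, j, k):
    """{ij,k} = g^{kp}[ij,p]; the inverse metric is 1/g_kk here (diagonal metric only)."""
    g = metric(x)
    dg = metric_derivs(metric, x)
    n = len(x)
    return sum((1.0 / g[k][p] if p == k else 0.0) * first_kind(dg, i, j, p)
               for p in range(n))

point = (2.0, 0.7, 1.5)                          # arbitrary point with r = 2
sym_tt_r = second_kind(metric_cyl, point, 1, 1, 0)   # {theta theta, r}
sym_rt_t = second_kind(metric_cyl, point, 0, 1, 1)   # {r theta, theta}
sym_tr_t = second_kind(metric_cyl, point, 1, 0, 1)   # {theta r, theta}
sym_rr_r = second_kind(metric_cyl, point, 0, 0, 0)   # {r r, r}
```

For this metric the only nonzero symbols of the second kind are {θθ,r} = −r and {rθ,θ} = {θr,θ} = 1/r, and the computed values match these at r = 2.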

Differentiation of Tensors

Previously it was noted that the gradient vector for a scalar function is a covariant tensor. The gradient is the set of partial derivatives of a scalar function, say f(x1, …, xn), i.e., (∂f/∂xi) for i=1 to n.

Consider now the second derivatives of the scalar function, (∂²f/∂xi∂xj) for i and j from 1 to n.
When the coordinate system is changed from X's to Y's the gradient components transform according to the tensor formula

∂f/∂yi = (∂f/∂xp)(∂xp/∂yi)

However the formula for the transformation of the second derivatives is

∂²f/∂yj∂yi = (∂/∂yj)((∂f/∂xp)(∂xp/∂yi))
= (∂²f/∂xp∂xq)(∂xq/∂yj)(∂xp/∂yi)
+ (∂f/∂xp)(∂²xp/∂yj∂yi)

If there were only the first term on the right-hand side then the transformation would be tensorial but the second term spoils it.
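A small worked case, chosen here purely for illustration, shows the second term at work. Take n=2 with x1 = y1cos(y2) and x2 = y1sin(y2), and let f = x1. Every second derivative of f with respect to the x's is zero, so a tensorial rule would make every second derivative with respect to the y's zero as well, yet one of them is not:

```latex
\begin{align*}
f &= x_1 = y_1 \cos y_2, \qquad
\frac{\partial^2 f}{\partial x_p\,\partial x_q} = 0 \quad \text{for all } p, q, \\
\frac{\partial^2 f}{\partial y_2\,\partial y_2} &= -\,y_1 \cos y_2
 = \frac{\partial f}{\partial x_1}\,\frac{\partial^2 x_1}{\partial y_2\,\partial y_2} .
\end{align*}
```

The whole of the nonzero result comes from the second term of the transformation formula.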

E.B. Christoffel derived an important formula concerning the second derivatives in a coordinate transformation, namely

(∂²xm/∂yj∂yi) = y{ij,r}(∂xm/∂yr) − x{pq,m}(∂xp/∂yj)(∂xq/∂yi)

where x{ij,k} and y{ij,k} stand for the Christoffel symbols of the second kind in the X coordinate system and the Y coordinate system, respectively.

The formula for the transformation of any covariant vector A in X coordinates to vector B in the Y coordinates is

Bi = Ap(∂xp/∂yi)
and differentiation with respect to yj gives
(∂Bi/∂yj) = (∂Ap/∂xq)(∂xp/∂yi)(∂xq/∂yj)
+ Ap(∂²xp/∂yj∂yi)

When the Christoffel formula, with appropriate indices, is substituted for the second derivative in the second term on the right the result is

(∂Bi/∂yj) = (∂Ap/∂xq)(∂xp/∂yi)(∂xq/∂yj)
+ y{ij,m}(∂xp/∂ym)Ap − x{mq,p}(∂xm/∂yi)(∂xq/∂yj)Ap

In the second term on the right-hand side (∂xp/∂ym)Ap is just equal to Bm so the above formula reduces to

(∂Bi/∂yj) = (∂Ap/∂xq)(∂xp/∂yi)(∂xq/∂yj)
+ y{ij,m}Bm − x{mq,p}(∂xm/∂yi)(∂xq/∂yj)Ap

Upon rearrangement, and with the summation indices m and p interchanged in the last term, this takes the form

(∂Bi/∂yj) − y{ij,m}Bm =
[(∂Ap/∂xq) − x{pq,m}Am](∂xp/∂yi)(∂xq/∂yj)

This is the transformation rule for a covariant tensor of rank two. Thus the quantity

∂Ai/∂xj − {ij,p}Ap

is a covariant tensor of rank two and is denoted as $A_{i,j}$. It is called the covariant derivative of a covariant vector.
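A simple check of this definition, offered only as a sketch: a covariant vector that is constant in Euclidean coordinates has a zero covariant derivative there, so by the tensor rule its covariant derivative must vanish in every coordinate system. The example below uses plane polar coordinates (r, θ), the constant vector A = (1, 0), and the standard polar values {θθ,r} = −r and {rθ,θ} = {θr,θ} = 1/r; the function names and the sample point are assumptions of this illustration.

```python
import math

def B(r, theta):
    """Polar components B_i = A_p (dx_p/dy_i) of the constant Euclidean covector A = (1, 0)."""
    return (math.cos(theta), -r * math.sin(theta))

def christoffel_polar(r):
    """Second-kind symbols for plane polar coordinates, stored as gamma[i][j][k] = {ij,k}."""
    gamma = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    gamma[1][1][0] = -r          # {theta theta, r}
    gamma[0][1][1] = 1.0 / r     # {r theta, theta}
    gamma[1][0][1] = 1.0 / r     # {theta r, theta}
    return gamma

def partial_derivs(r, theta, h=1e-6):
    """dB[i][j] = d B_i / d y_j by central differences, with y = (r, theta)."""
    y = (r, theta)
    dB = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        up = list(y); up[j] += h
        dn = list(y); dn[j] -= h
        b_up, b_dn = B(*up), B(*dn)
        for i in range(2):
            dB[i][j] = (b_up[i] - b_dn[i]) / (2 * h)
    return dB

def covariant_derivative(r, theta):
    """B_{i,j} = dB_i/dy_j - {ij,m} B_m."""
    b = B(r, theta)
    dB = partial_derivs(r, theta)
    gamma = christoffel_polar(r)
    return [[dB[i][j] - sum(gamma[i][j][m] * b[m] for m in range(2))
             for j in range(2)] for i in range(2)]

r0, theta0 = 2.0, 0.7                       # arbitrary sample point
plain = partial_derivs(r0, theta0)          # ordinary partial derivatives
cov = covariant_derivative(r0, theta0)      # covariant derivative
max_plain = max(abs(v) for row in plain for v in row)
max_cov = max(abs(v) for row in cov for v in row)
```

The ordinary partial derivatives of the polar components are not zero, but once the Christoffel terms are subtracted every component of the covariant derivative vanishes to within rounding.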

Likewise the covariant derivative of a contravariant vector $A^i$ can be defined as

$\partial A^i/\partial x_j + \{pj,i\}A^p$

and this is denoted as $A^i_{,j}$.

(To be continued.)
