San José State University 

appletmagic.com Thayer Watkins Silicon Valley & Tornado Alley USA 


This is an introduction to the concepts and procedures of tensor analysis. It uses the more familiar methods and notation of matrices to introduce them. First it is worthwhile to review the concept of a vector space and the space of linear functionals on a vector space.
A vector space is a set of elements V and a number of associated operations. There is an addition operation defined such that for any two elements u and v in V there is an element w=u+v. Furthermore there is an element of V, call it the zero vector 0, such that for any element u of V, u+0=u. And for any element u of V there is an element of V, say v, such that u+v=0. The element v is called the additive inverse of u and is denoted as −u. There is a set of scalars Λ for a vector space such that for any element v of V and any element λ of Λ there is an element of V, denoted as λv and called the scalar multiple of v by λ. Usually the set of scalars is the real numbers or the complex numbers.
For a vector space there is a set of elements of V, called a basis, such that any vector in V can be represented as a linear combination of the basis elements. The linear combination is given by the scalar coefficients for the basis elements. Thus any vector in V can be represented by an ordered array of scalar elements, which are called the components of the vector. Such an ordered array can be considered a column vector of the scalar elements. The basis of a vector space may or may not be finite.
Linear functionals on the vector space are linear functions which map the elements of V into the set of scalars Λ. Addition and scalar multiplication on the set of linear functionals are defined, and likewise a zero element and additive inverses. Thus the linear functionals over the set of vectors V form a vector space, called the dual space to the vector space for V.
If the vector space for V is of finite dimension n then so is its dual space and thus the dual space has the same structure as that for V. If the vector space for V is not of finite dimension then the dual space is not necessarily of the same structure.
For the vector space of n-dimensional column vectors the dual space may also be thought of as n-dimensional vectors. Then the linear functionals may be considered as matrix products; i.e., if X is an element of V and F is an element of the dual space then the linear functional is the matrix product of the transpose of F with X; i.e.,
z = F^{T}X.
Now consider a change in the coordinate system for V such that the vector of components X gets changed into the vector of components Y by multiplication by an invertible matrix M; i.e.,
Y = MX.
The linear functional z = F^{T}X gets transformed to z = G^{T}Y. Since Y=MX and X=M^{−1}Y, therefore
z = F^{T}X = F^{T}M^{−1}Y = ((M^{−1})^{T}F)^{T}Y, and hence G = (M^{−1})^{T}F.
The dual vectors get transformed by the inverse of the transpose of the transformation of the vectors. Thus there are some vectors which get transformed by one rule and other vectors which get transformed by an associated alternate rule. A similar arrangement occurs in tensor analysis in which some tensors are called covariant and transform according to one rule and others are called contravariant and transform according to an alternate but related rule.
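The invariance of the linear functional under this pair of rules can be checked numerically. The following is a minimal sketch, assuming an illustrative 2×2 matrix M and made-up components for X and F (none of these values come from the text); the helper functions are hypothetical names written for this example.

```python
# Sketch: the value of a linear functional is unchanged when vectors
# transform by M and dual vectors by the inverse transpose of M.

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("M must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

M = [[2.0, 1.0], [1.0, 3.0]]  # illustrative invertible matrix (assumption)
X = [1.0, 2.0]                # components of a vector (assumption)
F = [4.0, -1.0]               # components of a dual vector (assumption)

Y = matvec(M, X)                          # Y = M X
G = matvec(transpose(inverse_2x2(M)), F)  # G = (M^{-1})^T F

z_old = dot(F, X)  # F^T X in the old coordinates
z_new = dot(G, Y)  # G^T Y in the new coordinates
print(z_old, z_new)
```

The two printed values should agree up to floating-point rounding, which is the sense in which the functional is independent of the coordinate system.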
It is very tempting to identify the vectors of the dual space as row vectors. This would illustrate how the dual space could have the same mathematical structure as the primal (original) space yet be distinct. However there is no provision in tensor analysis for identifying one vector as a column vector and another one as a row vector.
A tensor is an entity which is represented in any coordinate system by an array of numbers called its components. The components change from coordinate system to coordinate system in a systematic way described by rules. The arrays of numbers are not the tensor; they are only the representation of the tensor in a particular coordinate system. A vector is a special case of a tensor. A vector is an entity which has direction and magnitude and is represented by a one dimensional array of numbers. Unfortunately it is common to consider any one dimensional array of numbers as a vector.
A more general notion of a vector involves not only its direction and magnitude but also its point of application. Thus a vector can be represented by two points; one point representing its point of application and the line between the points giving its direction and magnitude. The points in n-dimensional space can be considered invariant but their representations in a coordinate system are given by two n-dimensional arrays of numbers, say X_{1} and X_{2}. If the coordinate system is changed by shifting the origin (translation), rotating the axes and/or stretching the axes (dilation) then the components of the representation of the points change. The translation of the origin is given by an n-dimensional array, say A, and the rotation and dilation by a matrix, say B. The new representation of a point P is Y where
Y = A + BX
where X is the representation in the old coordinate system.
The allowable transformations are the ones which have inverses; i.e.,
X = B^{−1}(Y − A).
The conditions for the existence of an inverse transformation can be stated in terms of the Jacobian determinant for the transformation. Let J_{i,j}=∂y_{i}/∂x_{j} and let J be the matrix formed from the J_{ij}'s. The condition for the existence of an inverse transformation within some region R is that det(J)≠0 within that region.
If the x-coordinates are transformed to y-coordinates and the y-coordinates are then transformed to z-coordinates then the matrix for the transformation of the x-coordinates to the z-coordinates is just the product of the matrices for the two intermediate transformations; i.e., if J is the matrix for the transformation from the x's to the y's and K is the matrix of the transformation from the y's to the z's then, by the chain rule, the matrix of the transformation from the x's to the z's is KJ. Thus the Jacobian for the transformation from the x's to the z's is det(K)det(J) and hence if det(J) and det(K) are both nonzero in some region R then so is det(KJ). Thus if both of the intermediate transformations are allowable then so is their composition.
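A minimal numerical sketch of the composition argument, assuming two illustrative linear maps (the matrices J and K below are made-up values, not from the text). It applies the maps one after the other, compares the result with a single multiplication by the product matrix with the second map's matrix on the left, and checks that the determinant of the product equals the product of the determinants.

```python
# Sketch: composing two linear coordinate changes and comparing determinants.

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

J = [[1.0, 2.0], [0.0, 3.0]]  # x's -> y's (illustrative assumption)
K = [[2.0, 0.0], [1.0, 1.0]]  # y's -> z's (illustrative assumption)
x = [1.0, -1.0]

z_stepwise = matvec(K, matvec(J, x))  # apply J first, then K
composite = matmul(K, J)              # matrix of the x's -> z's map
z_direct = matvec(composite, x)

print(z_stepwise, z_direct)
print(det2(composite), det2(K) * det2(J))
```

Because the determinant of a product does not depend on the order of the factors, the conclusion about nonzero Jacobians holds whichever order the product is written in; only the composite matrix itself is order-sensitive.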
There is the trivial transformation y_{i}=x_{i}, called the identity transformation. The Jacobian of this transformation is equal to unity.
Since the composition of a transformation with its inverse is just the identity transformation it follows that the product of the Jacobian of a transformation with the Jacobian of the inverse transformation is equal to unity and hence det(J^{−1})=det(J)^{−1}.
All of this adds up to the mathematically interesting proposition that the set of allowable transformations of coordinate systems forms a group.
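The transformation from cylindrical coordinates to Euclidean coordinates will be used for the illustration. First these coordinates will be given in their more familiar form and then converted to the subscripted notation. The cylindrical coordinates are (r, θ, z) and the Euclidean coordinates are (x, y, z) where
x = r cos(θ), y = r sin(θ), z = z.
In subscripted notation then
x_{1} = r, x_{2} = θ, x_{3} = z and y_{1} = x, y_{2} = y, y_{3} = z.
Thus the transformation is
y_{1} = x_{1}cos(x_{2}), y_{2} = x_{1}sin(x_{2}), y_{3} = x_{3}.
Therefore the elements of the Jacobian matrix J_{i,j} = ∂y_{i}/∂x_{j} are
J_{1,1} = cos(x_{2}), J_{1,2} = −x_{1}sin(x_{2}), J_{1,3} = 0
J_{2,1} = sin(x_{2}), J_{2,2} = x_{1}cos(x_{2}), J_{2,3} = 0
J_{3,1} = 0, J_{3,2} = 0, J_{3,3} = 1.
The Jacobian determinant reduces to x_{1}[cos²(x_{2}) + sin²(x_{2})] and hence det(J)=x_{1}.
In the more familiar notation the Jacobian matrix is
[ cos(θ)   −r sin(θ)   0 ]
[ sin(θ)    r cos(θ)   0 ]
[ 0         0          1 ]
and the Jacobian of the transformation is r.
The Jacobian matrix of the inverse transformation, the inverse of the above matrix, is then
[ cos(θ)      sin(θ)     0 ]
[ −sin(θ)/r   cos(θ)/r   0 ]
[ 0           0          1 ]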
Its determinant is 1/r, which of course is the reciprocal of the determinant of the Jacobian matrix of the original transformation.
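The determinant result can be checked numerically. This is a rough sketch, assuming the standard map x = r cos(θ), y = r sin(θ), z = z and a central finite-difference approximation of the partial derivatives; the function names, the sample point and the step size h are choices made for this example.

```python
import math

def to_cartesian(p):
    """Map cylindrical (r, theta, z) to Cartesian (x, y, z)."""
    r, theta, z = p
    return [r * math.cos(theta), r * math.sin(theta), z]

def jacobian(f, p, h=1e-6):
    """Approximate J[i][j] = d f_i / d p_j by central differences."""
    n = len(p)
    columns = []
    for j in range(n):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = f(plus), f(minus)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # columns[j][i] holds d f_i / d p_j, so transpose into row-major form
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

point = [2.0, 0.7, 1.5]  # sample (r, theta, z), an arbitrary choice
J = jacobian(to_cartesian, point)
det_J = det3(J)
print(det_J, 1.0 / det_J)
```

At the sample point the first printed value should be close to r = 2 and the second close to 1/r, the determinant of the inverse transformation's Jacobian matrix.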
(To be continued.)
Let X be a representation of a vector in n-dimensional space and let h=f(X) be a scalar function of X. If the X representation is transformed to another coordinate system by the transformation T(), such that Y=T(X), where T is a vector-valued function, then the scalar function f(X) gets transformed into g(Y) where g(Y)=f(T^{−1}(Y)). For simplicity let the inverse transformation T^{−1}(Y) be represented as S(Y). Thus g(Y)=f(S(Y)).
Consider the vector whose components are {∂f/∂x^{i}, i=1,…,n}. This is called the gradient of the scalar function f. Now consider the gradient of the scalar function g(Y); i.e., {∂g/∂y^{j}, j=1,…,n}. How are the components of the gradient of g related to the components of the gradient of f? Calculus gives the answer as
∂g/∂y^{j} = Σ_{i} (∂f/∂x^{i})(∂x^{i}/∂y^{j}).
Summations almost always arise when there is a repeated index such as i in the above equation. Albert Einstein promoted the convention that since a repeated index almost always means summation over that index the summation sign Σ can be dispensed with. Therefore in tensor notation the above equation is
∂g/∂y^{j} = (∂f/∂x^{i})(∂x^{i}/∂y^{j}).
Let (∂x^{i}/∂y^{j}) be denoted as the (i,j) element of a matrix M and let (∂f/∂x^{i}) and (∂g/∂y^{j}) be the components of the vectors ∇_{x}f and ∇_{y}g, respectively. Then in matrix notation the above transformation of the gradient vector is represented as
∇_{y}g = M^{T}∇_{x}f.
Any entity whose components transform in this way is called a covariant tensor of rank 1. There are other entities which transform in a similar but different way. Consider the differentials dy^{j} and dx^{i} for i and j ranging from 1 to n. From calculus
dy^{j} = (∂y^{j}/∂x^{i})dx^{i}.
The quantities (∂y^{j}/∂x^{i}) are the elements of what previously had been referred to as the Jacobian matrix of the transformation. Denoting the Jacobian matrix as J then the transformation of the vector of infinitesimals dX to dY is represented as
dY = J dX.
This transformation is called contravariant and a vector transforming in this way would be called a contravariant tensor of rank 1.
What was denoted above as the matrix M is the Jacobian matrix of the inverse transformation. The covariant transformation of the gradient vectors expressed previously could be represented as
∇_{y}g = (J^{−1})^{T}∇_{x}f.
These two rules for transformation, covariant and contravariant, are the defining characteristic of tensors. They can be extended to multiple indices. The indices for covariance are conventionally denoted as subscripts and contravariance as superscripts.
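The covariant rule for the gradient can be illustrated with plane polar coordinates. The sketch below is only an example under stated assumptions: the scalar function f(x, y) = x² + 3y, the sample point, and the helper names are all made up for the illustration. It transforms the Cartesian gradient with the transpose of the matrix of partial derivatives ∂x^{i}/∂y^{j} and compares the result with a finite-difference gradient taken directly in the polar coordinates.

```python
import math

def f(x, y):
    """Illustrative scalar function in Cartesian coordinates (assumption)."""
    return x * x + 3.0 * y

def grad_f(x, y):
    """Exact Cartesian gradient of f."""
    return [2.0 * x, 3.0]

def g(r, theta):
    """The same scalar expressed in polar coordinates."""
    return f(r * math.cos(theta), r * math.sin(theta))

r, theta = 2.0, 0.5
x, y = r * math.cos(theta), r * math.sin(theta)

# M[i][j] = d x^i / d y^j with (x^1, x^2) = (x, y) and (y^1, y^2) = (r, theta)
M = [[math.cos(theta), -r * math.sin(theta)],
     [math.sin(theta), r * math.cos(theta)]]

# Covariant rule: dg/dy^j = sum over i of (df/dx^i) * (dx^i/dy^j)
gf = grad_f(x, y)
grad_g_rule = [sum(gf[i] * M[i][j] for i in range(2)) for j in range(2)]

# Direct central-difference gradient of g in the polar coordinates
h = 1e-6
grad_g_numeric = [(g(r + h, theta) - g(r - h, theta)) / (2 * h),
                  (g(r, theta + h) - g(r, theta - h)) / (2 * h)]

print(grad_g_rule)
print(grad_g_numeric)
```

The two printed lists should agree to within the finite-difference error, which is what it means for the gradient components to transform covariantly.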
One scalar quantity of importance in geometry is distance. In Euclidean coordinates for an n-dimensional space the formula for the length ds of an infinitesimal line segment is
(ds)² = Σ_{i} (dx^{i})².
If dX represents the column vector of dx^{i}'s then the formula for the line length could be expressed as
(ds)² = (dX)^{T}(dX).
If the coordinate system is changed from the Euclidean X's to some coordinate system of Y's then
dx^{i} = (∂x^{i}/∂y^{j})dy^{j}.
In matrix notation this may be expressed as
dX = M dY
where M is the matrix whose m_{i,j} element is (∂x^{i}/∂y^{j}). Thus (dX)^{T}=(dY)^{T}(M)^{T} and hence
(ds)² = (dY)^{T}M^{T}M dY = (dY)^{T}G dY
where the matrix G is (M)^{T}M. The elements of G are given by
g_{i,k} = Σ_{j} (∂x^{j}/∂y^{i})(∂x^{j}/∂y^{k}).
The two dimensional array of the g_{i,k}'s is called the metric tensor. That it is in fact a tensor and a covariant one at that is something that needs to be proven. It is the metric tensor for the coordinate system Y. The metric tensor for the Euclidean coordinate system is such that g_{i,k}=δ_{i,k}, where δ_{i,k}=0 if i≠k and =1 if i=k.
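For cylindrical coordinates the inverse of the transformation from Euclidean to cylindrical, i.e., the transformation from cylindrical to Euclidean, is in familiar notation
x = r cos(θ), y = r sin(θ), z = z.
Thus the elements of the matrix M are given by
∂x/∂r = cos(θ), ∂x/∂θ = −r sin(θ), ∂x/∂z = 0
∂y/∂r = sin(θ), ∂y/∂θ = r cos(θ), ∂y/∂z = 0
∂z/∂r = 0, ∂z/∂θ = 0, ∂z/∂z = 1.
Thus the matrix M is
[ cos(θ)   −r sin(θ)   0 ]
[ sin(θ)    r cos(θ)   0 ]
[ 0         0          1 ]
and its transpose is
[ cos(θ)      sin(θ)     0 ]
[ −r sin(θ)   r cos(θ)   0 ]
[ 0           0          1 ]
Thus the metric matrix G=M^{T}M is
[ 1   0    0 ]
[ 0   r²   0 ]
[ 0   0    1 ]
The length of an infinitesimal line element is then given by
(ds)² = (dr)² + r²(dθ)² + (dz)².
The inverse of the metric tensor is significant. In the case of the cylindrical coordinate system this inverse G^{−1} is
[ 1   0      0 ]
[ 0   1/r²   0 ]
[ 0   0      1 ]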
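As a small check of the cylindrical metric, the sketch below builds the matrix of partial derivatives of (x, y, z) with respect to (r, θ, z), assuming the standard map x = r cos(θ), y = r sin(θ), z = z, and forms the product of its transpose with itself. The function names and the sample values of r and θ are choices made for this example.

```python
import math

def cylindrical_to_cartesian_jacobian(r, theta):
    """Matrix M with entries d(x, y, z) / d(r, theta, z)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s, 0.0],
            [s, r * c, 0.0],
            [0.0, 0.0, 1.0]]

def metric_from_jacobian(M):
    """G = M^T M, i.e. g[i][k] = sum over j of M[j][i] * M[j][k]."""
    n = len(M)
    return [[sum(M[j][i] * M[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

r, theta = 2.0, 0.7  # arbitrary sample point
G = metric_from_jacobian(cylindrical_to_cartesian_jacobian(r, theta))
for row in G:
    print(row)
```

Up to rounding, the result should be diagonal with entries 1, r² and 1, whatever value of θ is chosen.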
In tensor analysis the metric tensor is denoted as g_{i,j} and its inverse is denoted as g^{i,j}. This latter notation suggests that the inverse has something to do with contravariance. For a column vector X in the Euclidean coordinate system its components in another coordinate system are given by Y=MX. Now consider G^{−1}X. Since G=M^{T}M,
G^{−1} = M^{−1}(M^{T})^{−1}.
Thus the transformation of G^{−1}X would be given by
M(G^{−1}X) = MM^{−1}(M^{T})^{−1}X = (M^{T})^{−1}X.
This is the transformation rule for a contravariant tensor. It is also the rule for what in the Introduction was referred to as a vector of the dual space. Thus multiplication of a covariant tensor by the contravariant metric tensor creates a contravariant tensor. The operation also works in the other direction. Multiplication of a contravariant tensor by the metric tensor produces a covariant tensor.
Suppose the transformation of the coordinate system is such that Y=MX; then the linear functional H in the new coordinate system corresponding to F in the original coordinate system is given by
H = (M^{−1})^{T}F.
Now consider GF where G=M^{T}M. Then
(M^{−1})^{T}(GF) = (M^{−1})^{T}M^{T}MF = MF.
This is the transformation rule for a covariant tensor. Thus multiplication of a covariant tensor by the inverse metric tensor produces a contravariant tensor.
The Christoffel 3-index symbol of the first kind is defined as
[ij,k] = ½[(∂g_{i,k}/∂x^{j}) + (∂g_{j,k}/∂x^{i}) − (∂g_{i,j}/∂x^{k})].
A property of these symbols obvious from the definition is that
[ij,k] = [ji,k].
When the Christoffel symbols of the first kind are multiplied by the elements of the inverse of the metric tensor and the results summed over the third index, a different set of symbols is generated, which are called the Christoffel 3-index symbols of the second kind. There are different notations used for these symbols, but the typographically most convenient is, in analogy with the symbols of the first kind, {ij,k}. I. S. Sokolnikoff in his Tensor Analysis uses a two level symbol impossible to create in HTML. The Princeton school uses a symbol of the form Γ_{ij}^{k} but the superscripts and subscripts cannot be properly lined up using HTML. Therefore {ij,k} will be used here. The definition is
{ij,k} = g^{k,m}[ij,m].
It follows from the symmetry of [ij,k] with respect to the interchange of the first two indices that
{ij,k} = {ji,k}.
The Christoffel symbols are essential in defining the derivative of tensors. However the Christoffel symbols themselves do not represent a tensor. That is to say they do not transform upon a change in coordinate system according to either the covariant or contravariant rules of transformation.
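For a concrete case, the symbols can be computed for cylindrical coordinates. The sketch below assumes the standard definitions, [ij,k] = ½(∂g_{ik}/∂x^{j} + ∂g_{jk}/∂x^{i} − ∂g_{ij}/∂x^{k}) for the first kind and {ij,k} = g^{km}[ij,m] for the second kind, together with the cylindrical metric diag(1, r², 1). Derivatives of the metric are approximated by central differences; the function names, the step size and the sample point are choices made for this example.

```python
N = 3  # coordinates are (r, theta, z), indexed 0, 1, 2

def metric(p):
    """Cylindrical metric diag(1, r^2, 1)."""
    r = p[0]
    return [[1.0, 0.0, 0.0], [0.0, r * r, 0.0], [0.0, 0.0, 1.0]]

def inverse_metric(p):
    """Inverse of the cylindrical metric, diag(1, 1/r^2, 1)."""
    r = p[0]
    return [[1.0, 0.0, 0.0], [0.0, 1.0 / (r * r), 0.0], [0.0, 0.0, 1.0]]

def metric_derivative(p, m, h=1e-6):
    """Central-difference approximation of d g_ij / d x^m."""
    plus, minus = list(p), list(p)
    plus[m] += h
    minus[m] -= h
    gp, gm = metric(plus), metric(minus)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(N)]
            for i in range(N)]

def first_kind(p):
    """first[i][j][k] = [ij,k]."""
    dg = [metric_derivative(p, m) for m in range(N)]  # dg[m][i][j]
    return [[[0.5 * (dg[j][i][k] + dg[i][j][k] - dg[k][i][j])
              for k in range(N)] for j in range(N)] for i in range(N)]

def second_kind(p):
    """second[i][j][k] = {ij,k} = sum over m of g^{km} [ij,m]."""
    first, ginv = first_kind(p), inverse_metric(p)
    return [[[sum(ginv[k][m] * first[i][j][m] for m in range(N))
              for k in range(N)] for j in range(N)] for i in range(N)]

point = [2.0, 0.7, 1.5]  # arbitrary sample point with r = 2
symbols = second_kind(point)
print(symbols[1][1][0])  # {theta theta, r}
print(symbols[0][1][1])  # {r theta, theta}
```

For this metric the only nonzero symbols of the second kind are {θθ,r} = −r and {rθ,θ} = {θr,θ} = 1/r, so the two printed values should be close to −2 and 0.5 at the sample point.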
Previously it was noted that the gradient vector for a scalar function is a covariant tensor. The gradient is the set of partial derivatives of a scalar function, say f(x^{1}, …, x^{n}), i.e., (∂f/∂x^{i}) for i=1 to n.
Consider now the second derivatives of the scalar function
∂²f/∂x^{i}∂x^{j}.
When the coordinate system is changed from X's to Y's the gradient components transform according to the tensor formula
∂f/∂y^{j} = (∂f/∂x^{i})(∂x^{i}/∂y^{j}).
However the formula for the transformation of the second derivatives is
∂²f/∂y^{j}∂y^{k} = (∂²f/∂x^{i}∂x^{m})(∂x^{i}/∂y^{j})(∂x^{m}/∂y^{k}) + (∂f/∂x^{i})(∂²x^{i}/∂y^{j}∂y^{k}).
If there were only the first term on the right-hand side then the transformation would be tensorial but the second term spoils it.
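E.B. Christoffel derived an important formula concerning the second derivatives in a coordinate transformation, namely
∂²x^{p}/∂y^{i}∂y^{j} = _{y}{ij,m}(∂x^{p}/∂y^{m}) − _{x}{qr,p}(∂x^{q}/∂y^{i})(∂x^{r}/∂y^{j})
where _{x}{ij,k} and _{y}{ij,k} stand for the Christoffel symbols of the second kind in the X coordinate system and the Y coordinate system, respectively.
The formula for the transformation of any covariant vector A in the X coordinates to the vector B in the Y coordinates is
B_{i} = (∂x^{p}/∂y^{i})A_{p}.
Differentiating this with respect to y^{j} gives
∂B_{i}/∂y^{j} = (∂A_{p}/∂x^{q})(∂x^{p}/∂y^{i})(∂x^{q}/∂y^{j}) + A_{p}(∂²x^{p}/∂y^{i}∂y^{j}).
When the Christoffel formula, with appropriate indices, is substituted for the second term on the right the result is
∂B_{i}/∂y^{j} = (∂A_{p}/∂x^{q})(∂x^{p}/∂y^{i})(∂x^{q}/∂y^{j}) + _{y}{ij,m}(∂x^{p}/∂y^{m})A_{p} − _{x}{qr,p}(∂x^{q}/∂y^{i})(∂x^{r}/∂y^{j})A_{p}.
In the second term on the right-hand side (∂x^{p}/∂y^{m})A_{p} is just equal to B_{m} so the above formula reduces to
∂B_{i}/∂y^{j} = (∂A_{p}/∂x^{q})(∂x^{p}/∂y^{i})(∂x^{q}/∂y^{j}) + _{y}{ij,m}B_{m} − _{x}{qr,p}A_{p}(∂x^{q}/∂y^{i})(∂x^{r}/∂y^{j}).
Upon rearrangement, with the summation indices p, q in the first term on the right relabeled as q, r, this takes the form
∂B_{i}/∂y^{j} − _{y}{ij,m}B_{m} = [∂A_{q}/∂x^{r} − _{x}{qr,p}A_{p}](∂x^{q}/∂y^{i})(∂x^{r}/∂y^{j}).
This is the transformation rule for a covariant tensor of rank two. Thus the quantity
∂A_{i}/∂x^{j} − {ij,m}A_{m}
is a covariant tensor of rank two and is denoted as A_{i, j}. It is called the covariant derivative of a covariant vector.
Likewise the derivative of a contravariant vector A^{i} can be defined as
∂A^{i}/∂x^{j} + {mj,i}A^{m}
and this is denoted as A^{i}_{, j}.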
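One way to see what the covariant derivative accomplishes is to apply it to a vector field that is constant in Euclidean coordinates. The sketch below is an illustration under stated assumptions: it works in plane polar coordinates (r, θ), takes the standard formula A_{i,j} = ∂A_{i}/∂x^{j} − {ij,m}A_{m}, uses the known polar values {θθ,r} = −r and {rθ,θ} = {θr,θ} = 1/r, and takes the covariant polar components of the constant field (1, 0) to be (cos θ, −r sin θ). Partial derivatives are approximated by central differences and all names are made up for the example.

```python
import math

def christoffel_second_kind(r):
    """sym[i][j][k] = {ij,k} for polar coordinates; index 0 = r, 1 = theta."""
    sym = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    sym[1][1][0] = -r
    sym[0][1][1] = 1.0 / r
    sym[1][0][1] = 1.0 / r
    return sym

def covariant_components(p):
    """Covariant polar components of the constant Cartesian field (1, 0)."""
    r, theta = p
    return [math.cos(theta), -r * math.sin(theta)]

def partial(field, p, j, h=1e-6):
    """Central-difference approximation of d A_i / d x^j for every i."""
    plus, minus = list(p), list(p)
    plus[j] += h
    minus[j] -= h
    return [(a - b) / (2 * h) for a, b in zip(field(plus), field(minus))]

def covariant_derivative(field, p):
    """result[i][j] = dA_i/dx^j - sum over m of {ij,m} A_m."""
    A = field(p)
    sym = christoffel_second_kind(p[0])
    dA = [partial(field, p, j) for j in range(2)]  # dA[j][i] = dA_i/dx^j
    return [[dA[j][i] - sum(sym[i][j][m] * A[m] for m in range(2))
             for j in range(2)] for i in range(2)]

point = [2.0, 0.5]  # arbitrary sample (r, theta)
D = covariant_derivative(covariant_components, point)
plain = [partial(covariant_components, point, j) for j in range(2)]
print(D)
print(plain)
```

The ordinary partial derivatives of the polar components are not zero, even though the field itself does not vary; the Christoffel terms cancel that coordinate-induced variation, so every entry of the covariant derivative should vanish up to the finite-difference error.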
(To be continued.)