The Nature of the Principle of Least Action in Mechanics

San José State University

applet-magic.com Thayer Watkins Silicon Valley & Tornado Alley USA

Valid, quantitatively accurate mechanics began with Galileo, but it was Isaac Newton who brought it into full development. Newtonian mechanics involved particles moving in response to forces. After Newton, physicists sought to reformulate mechanics in terms of minimization principles. These schemes were more elegant than Newtonian mechanics but, as is argued below, they are misleading.
There are in physics two principles of minimization. One is in optics, Fermat's Principle of Least Time, according to which a light ray travels from one point to another by the path that involves the least time. The other is in mechanics, Hamilton's Principle of Least Action. There is a quantity called action that can be computed for each path that a system can take in evolving from its initial state to its final state. According to Hamilton's principle, the path that the system takes is the one that involves the minimum value of action. Hamilton's principle has been modified to the condition that the path taken involves stationarity of action with respect to nearby paths. This may involve maximization or inflection points as well as minimization.
These principles need to be explained. How can a noncognizant system find the path of least time or least action? There is in mathematics what is known as Pontryagin's Maximum Principle, which says that the path which maximizes a particular function over a time period is the one that maximizes a related function at each instant. This related function must have some physical reality, and the instant-by-instant maximization is the real explanation for how the system evolves; it is only incidentally that the overall function on the interval of time is maximized.
In a way it is like the matter of electric and magnetic field intensity. These observables can be derived from a vector potential function and a scalar potential function. The potential functions were initially thought to be just mathematical conveniences, but they appear to have some physical existence. See The Aharonov-Bohm Effect. In physics the related functions are momenta.
Pontryagin's Maximum Principle

Pontryagin's Maximum Principle applies to a particular type of problem called a Bolzano Problem. Most optimization problems can be put into the form of a Bolzano problem, but more about that later.
A Bolzano problem involves a number of state variables which can change over time where time t runs from 0 to T. Let us suppose the state variables are X₁(t), X₂(t), ..., X_n(t). We want to maximize
V(T) = c₁X₁(T) + c₂X₂(T) + ...+c_nX_n(T),

given that we start at the point X₁(0), X₂(0), ..., X_n(0), and where the coefficients c₁, c₂, ..., c_n are given and T is some definite finite time. We are given so-called steering functions for controlling the changes in the state variables; i.e.,
dX₁/dt = f₁(X₁, X₂, .., X_n, u₁, u₂, .., u_m)

dX₂/dt = f₂(X₁, X₂, .., X_n, u₁, u₂, .., u_m)

...........................................................

dX_n/dt = f_n(X₁, X₂, .., X_n, u₁, u₂, .., u_m)

where the variables u₁, u₂, ..., u_m are functions of time and are called the control variables. The objective is to choose the control variables at each instant of time so as to steer the state variables from their initial values
X₁(0), X₂(0), ..., X_n(0)

to some point

X₁(T), X₂(T), ..., X_n(T)

where V(T) = c₁X₁(T) + c₂X₂(T) + ... + c_nX_n(T) is maximized.
This seems to be a very difficult task. Pontryagin's Maximum Principle provides a neat, systematic solution.
To implement Pontryagin's method one defines a Hamiltonian function
H = φ₁f₁ + φ₂f₂ + ... + φ_nf_n
= Σφ_if_i,

where the functions f_i for i=1 to n are steering functions defined above and the set of adjoint variables φ₁, φ₂, .., φ_n are such that
dφ_j/dt = −∂H/∂X_j
= −Σ_i φ_i(∂f_i/∂X_j)

and φ_i(T)= c_i for i=1, 2, .., n. Note that if H does not depend upon X_j then dφ_j/dt=0 for all t and thus φ_j(t) is a constant. In physics φ_j would be said to be conserved.
The optimum values of the control variables at time t are the ones that maximize H.
This usually means that the optimum u_k(t) is such that
∂H/∂u_k(t) = 0
which means

Σ_iφ_i(∂f_i/∂u_k(t)) = 0
for k=1, ..., m.

unless u_k is constrained, in which case the optimal u_k may be at a limit of its range.
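A small numerical sketch may make the procedure concrete. The problem below is an illustrative assumption, not one taken from the text: two state variables with steering functions dX₁/dt = X₂ and dX₂/dt = u, a control constrained to −1 ≤ u ≤ 1, and objective coefficients c₁ = 1, c₂ = 0. The adjoint equations then give φ₁(t) = 1 and φ₂(t) = T − t, so maximizing H = φ₁X₂ + φ₂u at each instant puts u at the upper limit of its range. The names `pontryagin_control`, `simulate` and the rival controls are hypothetical.

```python
import math

T = 2.0        # final time (illustrative value)
STEPS = 2000   # number of Euler steps

def pontryagin_control(t):
    """Pick u in [-1, 1] that maximizes H = phi1*X2 + phi2*u at time t."""
    phi2 = T - t   # solves dphi2/dt = -phi1 = -1 with phi2(T) = c2 = 0
    return 1.0 if phi2 >= 0.0 else -1.0

def simulate(control):
    """Euler-integrate dX1/dt = X2, dX2/dt = u from X1 = X2 = 0; return X1(T)."""
    dt = T / STEPS
    x1 = x2 = 0.0
    for k in range(STEPS):
        u = max(-1.0, min(1.0, control(k * dt)))  # enforce the constraint
        x1 += x2 * dt
        x2 += u * dt
    return x1

best = simulate(pontryagin_control)

# A few arbitrary rival controls for comparison (assumed, for illustration).
rivals = {
    "half throttle": lambda t: 0.5,
    "brake then push": lambda t: -1.0 if t < T / 2 else 1.0,
    "oscillating": lambda t: math.cos(3.0 * t),
}
rival_values = {name: simulate(u) for name, u in rivals.items()}

print("instant-by-instant maximization of H:", best)
for name, value in rival_values.items():
    print(name, value)
```

The control chosen instant by instant from H gives the largest X₁(T) among the controls tried, which is what the maximum principle leads one to expect for this toy problem.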
The situation can be summed up as follows:

Instantaneous maximization
involving adjoint variables

<=>

Maximization over an
interval of time

The relationship between the interval optimization and instant-by-instant optimization works both ways, but the instant-by-instant evolution of a physical system is fundamental and it is the minimization of action which is derived.
The Fundamentality of the Instant-by-instant Mechanics of Physical Systems

This proposition can be established by finding physical situations in which action is not minimized. Take for example an electron moving from point A toward point B which encounters midway a positron. The positron and electron are replaced by two gamma photons traveling at the speed of light in opposite directions. Momentum and energy are conserved in the annihilation, but action is not minimized. It is not even defined and that does not constitute any violation of normal physical behavior.
Derivation of Lagrangian Mechanics from Pontryagin's Maximum Principle

There is no problem involved in using a maximization principle to solve a minimization problem. One simply maximizes the negative of the quantity to be minimized.
The typical physical system involves a set of state variables, q_i for i=1 to n, and their time derivatives. The difference between the kinetic energy and the potential energy of the system is called the Lagrangian of the system L where
L = L(q₁, …, q_n, (dq₁/dt), …, (dq_n/dt), t)

The action S for the system over the interval 0 to t is
S(t) = ∫₀^tLdt
and thus
(dS/dt) = L

This equation is the steering function for an (n+1)-th variable, but rather than label S as variable (n+1) it is more convenient to label it the zeroth variable. It is also convenient to let Q stand for the vector of the state variables q_i for i = 1 to n. The steering functions for the problem then are:
(dS/dt) = L(Q, dQ/dt, t)
(dQ/dt) = (dQ/dt)
(d²Q/dt²) = U

The adjoint variables for the q_i for i=0 to n are given by φ_i where
(dφ_i/dt) = (∂L/∂q_i)
for i = 0 to n

In physical situations the adjoint variables are the generalized momenta of the state variables.
The coefficients in the objective function are all zero except for c₀ which is equal to −1, since S is being minimized. Since φ₀(T)=c₀ and (dφ₀/dt) = 0 for all t, φ₀(t)=−1 for all t.
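As a worked illustration of how the adjoint variables turn out to be momenta, consider a single particle with L = ½mv² − V(q). The sketch below is one possible way of casting it in the form used above. Treating the velocity v as an explicit state variable, treating the acceleration u as an unconstrained control, and the labels φ_q and φ_v are assumptions made here for illustration; the endpoint q(T) is taken as fixed, so only φ₀ is pinned by the objective. Because H is linear in u, a finite maximizing u can exist only if the coefficient of u vanishes.

```latex
\begin{aligned}
&\text{Steering functions:} &&
  \frac{dS}{dt} = \tfrac12 m v^2 - V(q), \qquad
  \frac{dq}{dt} = v, \qquad
  \frac{dv}{dt} = u \\
&\text{Hamiltonian:} &&
  H = \varphi_0\left(\tfrac12 m v^2 - V(q)\right) + \varphi_q v + \varphi_v u,
  \qquad \varphi_0 = -1 \\
&\text{Stationarity in } u: &&
  \frac{\partial H}{\partial u} = \varphi_v = 0 \quad \text{for all } t \\
&\text{Adjoint equation for } v: &&
  \frac{d\varphi_v}{dt} = -\frac{\partial H}{\partial v} = m v - \varphi_q = 0
  \;\Longrightarrow\; \varphi_q = m v \\
&\text{Adjoint equation for } q: &&
  \frac{d\varphi_q}{dt} = -\frac{\partial H}{\partial q} = -\frac{dV}{dq} \\
&\text{Combined:} &&
  m\,\frac{dv}{dt} = -\frac{dV}{dq}
\end{aligned}
```

Under these assumptions the adjoint variable φ_q is the momentum mv, and its adjoint equation is Newton's second law.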
The quantities (d²q_i/dt²) are in the nature of accelerations and thus closely related to forces.
The Lagrangian L is K−V so
(∂L/∂q_i) = −(∂V/∂q_i)

but −(∂V/∂q_i) is just a force. The Euler-Lagrange equation for the minimization problem is
(∂L/∂q_i) = d(∂L/∂(dq_i/dt))/dt

But ∂L/∂(dq_i/dt) is the generalized momentum for the state variable q_i, which typically is a mass variable times the time rate of change of q_i; i.e., m_i(dq_i/dt)=m_iv_i.
Thus more specifically (d²q_i/dt²) is equal to a force divided by a mass; F_i/m_i. This may be appropriately called reduced force. The control variable u_i(t) should be chosen in the direction of the maximum reduced force F_i/m_i.
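The relation between the instant-by-instant rule and the action can be checked numerically. The sketch below is only an illustration under assumed values: a particle of mass `M` in uniform gravity `G`, stepped forward using nothing but the instantaneous reduced force F/m, after which the discretized action of the resulting path is compared with that of nearby paths having the same endpoints. All names and numbers here are hypothetical.

```python
import math

M, G = 1.0, 9.81        # mass and gravitational acceleration (assumed values)
T, STEPS = 1.0, 1000    # time interval and number of steps
DT = T / STEPS

def newtonian_path(v0):
    """Build the path step by step from the reduced force F/m alone."""
    q, v, path = 0.0, v0, [0.0]
    for _ in range(STEPS):
        a = (-M * G) / M                  # reduced force: force divided by mass
        q += v * DT + 0.5 * a * DT * DT
        v += a * DT
        path.append(q)
    return path

def action(path):
    """Discretized action: sum of (K - V) * dt, with V = M*G*q."""
    total = 0.0
    for k in range(STEPS):
        velocity = (path[k + 1] - path[k]) / DT
        midpoint = 0.5 * (path[k + 1] + path[k])
        total += (0.5 * M * velocity ** 2 - M * G * midpoint) * DT
    return total

def perturbed(path, eps):
    """Add eps*sin(pi*t/T), a deviation that vanishes at both endpoints."""
    return [q + eps * math.sin(math.pi * k * DT / T) for k, q in enumerate(path)]

base = newtonian_path(v0=3.0)
for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(eps, action(perturbed(base, eps)))
```

In this example the unperturbed path (eps = 0) has the smallest action of the paths tried, even though no action was computed while constructing it.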
Illustration

Consider an inclined plane and a weighty object resting on it. The vertical force representing the weight of the object may be resolved into components, one of which is perpendicular to the plane. The component in the plane can be taken along many directions. The one that has the largest magnitude is the one that points in the direction of the downward gradient of the plane. That is the direction in which the object moves. That happens to be the direction of the greatest reduced force.

In the above graph the blue line represents the points of constant height. The gradient of the plane is perpendicular to the line of constant height.
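The claim can be verified with a short calculation. The sketch below assumes a plane tilted by an illustrative 30 degrees about the y-axis and a unit weight; it scans every direction lying in the plane and records the one along which the weight has the largest component. The names `UPHILL`, `LEVEL` and `in_plane_direction` are hypothetical.

```python
import math

ALPHA = math.radians(30.0)   # tilt of the plane (assumed value)
WEIGHT = 1.0                 # magnitude of the weight (assumed value)

# Two perpendicular unit vectors lying in the plane.
UPHILL = (math.cos(ALPHA), 0.0, math.sin(ALPHA))   # along the gradient, upward
LEVEL = (0.0, 1.0, 0.0)                            # along a line of constant height
GRAVITY = (0.0, 0.0, -WEIGHT)                      # vertical force

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def in_plane_direction(theta):
    """Unit vector in the plane at angle theta from the uphill direction."""
    return tuple(math.cos(theta) * u + math.sin(theta) * l
                 for u, l in zip(UPHILL, LEVEL))

N = 3600
angles = [2.0 * math.pi * k / N for k in range(N)]

# Direction in the plane along which the weight has its largest component.
best_theta = max(angles, key=lambda th: dot(GRAVITY, in_plane_direction(th)))
best_dir = in_plane_direction(best_theta)
best_force = dot(GRAVITY, best_dir)

print("angle from uphill (degrees):", math.degrees(best_theta))
print("force component:", best_force, " W*sin(alpha):", WEIGHT * math.sin(ALPHA))
print("component along the level line:", dot(best_dir, LEVEL))
```

The maximizing direction lies opposite to the uphill vector, that is, down the gradient and perpendicular to the line of constant height, and the force component there is the weight times the sine of the tilt angle.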
Conclusion

A physical system moves along the path of least action because at each instant it moves according to a criterion which results in it incidentally minimizing action. That instantaneous criterion can be represented as the system moving in the "direction" of maximum reduced force, force divided by mass. It is this instant-by-instant dynamics which is fundamental. Treating the Principle of Least Action as fundamental is misleading.
