Here is another proof of Maupertuis's principle. This version is purely Hamiltonian and independent of the Lagrangian approach.



The proof is based upon the Hamiltonian version of the Vector Field Straightening Theorem. This style of exposition seems to make the non-trivial construction easier to follow.

First recall and briefly discuss the Hamiltonian Least Action Principle.

The Hamilton Least Action Principle.

Let ##(M,\omega)## be a symplectic manifold with local symplectic coordinates

##z=(x,y)\in M##,

$$x=(x^1,\ldots,x^m),\quad y=(y_1,\ldots,y_m),\quad \omega=dy_i\wedge dx^i.$$

A Hamiltonian system with Hamiltonian function ##H=H(t,z)## is given by

$$\dot y=-\frac{\partial H}{\partial x},\quad \dot x=\frac{\partial H}{\partial y}.\qquad (1)$$

This system is also denoted as ##\dot z=v_H(t,z).##

Let ## M^e=\mathbb{R}\times M=\{u=(t,z)\}## stand for the extended phase space of system (1).

Introduce the set ##\Gamma=\{\gamma\}## of smooth curves ##\gamma\subset M^e## that connect the surfaces ##\{t=t_i,\quad x=\tilde x_i\}\subset M^e,\quad i=1,2##.

Consider the Action functional

$$\gamma\xrightarrow{\mathcal F}\int_\gamma\psi,\quad \psi=y_idx^i-Hdt$$ defined on the set ##\Gamma##.

Theorem 1.

1) Assume that ##\hat\gamma\in \Gamma## is a critical point of the functional ##\mathcal F## and this curve can be parametrized with the parameter ##t\in[t_1,t_2];\quad \hat\gamma=\{(t,\hat z(t))\}##. Then ##\hat z(t)## is a solution to system (1).

2) Assume that a function $$\hat z(t)=(\hat x,\hat y)(t),\quad \hat x(t_i)=\tilde x_i,\quad i=1,2$$ is a solution to system (1). Then ##(t,\hat z(t))## is a parametric representation of some curve ##\hat\gamma\in \Gamma## which is a critical point of ##\mathcal F##.

The proof of this theorem is straightforward: one should just take a smooth family ##\gamma_\varepsilon=(t,z_\varepsilon(t))\in \Gamma,\quad (t,z_0(t))=\hat\gamma,\quad z_0=\hat z##

and calculate

$$\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\mathcal F(\gamma_\varepsilon)=\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\int_{t_1}^{t_2}\Big(y_i(\varepsilon,t)\dot x^i(\varepsilon,t)-H(t,z_\varepsilon(t))\Big)dt$$

$$=\int_{t_1}^{t_2}\Big(\frac{\partial y_i}{\partial\varepsilon}\Big(\dot x^i-\frac{\partial H}{\partial y_i}\Big)$$

$$-\frac{\partial x^i}{\partial\varepsilon}\Big(\dot y_i+\frac{\partial H}{\partial x^i}\Big)\Big)\Big|_{\varepsilon=0}dt=0.$$

Here we use integration by parts and the fact that

$$\frac{\partial x}{\partial \varepsilon}(\varepsilon,t_i)=0,\quad i=1,2.$$
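The computation above can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the illustrative Hamiltonian ##H=(x^2+y^2)/2## with ##m=1##, the interval ##[0,1]##, and hypothetical perturbation profiles `a` and `b` chosen only for this example. The first variation is approximated by a central difference in ##\varepsilon##.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

T = 1.0  # time interval [0, T]; an arbitrary illustrative choice

def action(eps, y_scale):
    """F(gamma_eps) for the assumed Hamiltonian H = (x^2 + y^2)/2.

    Base curve: x = sin t, y = y_scale * cos t.
    It solves system (1) only when y_scale == 1.
    """
    def integrand(t):
        a = math.sin(math.pi * t / T)   # dx/d(eps); vanishes at t = 0 and t = T
        a_dot = (math.pi / T) * math.cos(math.pi * t / T)
        b = 1.0 + t                     # dy/d(eps); the endpoints of y are free
        x = math.sin(t) + eps * a
        x_dot = math.cos(t) + eps * a_dot
        y = y_scale * math.cos(t) + eps * b
        return y * x_dot - 0.5 * (x * x + y * y)
    return simpson(integrand, 0.0, T)

def first_variation(y_scale, step=1e-3):
    """Central difference approximation of dF/d(eps) at eps = 0."""
    return (action(step, y_scale) - action(-step, y_scale)) / (2 * step)

print(first_variation(1.0))  # base curve solves (1): close to zero
print(first_variation(2.0))  # base curve does not solve (1): clearly nonzero
```

Only the ##x##-component of the perturbation is required to vanish at the endpoints, in agreement with the definition of ##\Gamma##.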

The Maupertuis Principle.

Let the function ##H## be independent of ##t:\quad H=H(z).## Then ##H## is a first integral of system (1). Let

$$S_h=\{z\in M\mid H(z)=h\}$$ be a level surface of ##H##.
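That a solution stays on its level surface ##S_h## can be observed numerically. The sketch below assumes the pendulum Hamiltonian ##H=y^2/2-\cos x## (an illustrative choice, not taken from the text) and a classical Runge-Kutta integrator; the step size and initial point are arbitrary.

```python
import math

def hamiltonian(x, y):
    """Assumed example: H = y^2/2 - cos x (pendulum)."""
    return 0.5 * y * y - math.cos(x)

def vector_field(x, y):
    """v_H in the order (x_dot, y_dot) = (dH/dy, -dH/dx)."""
    return y, -math.sin(x)

def rk4_step(x, y, dt):
    """One classical fourth-order Runge-Kutta step for z' = v_H(z)."""
    k1x, k1y = vector_field(x, y)
    k2x, k2y = vector_field(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = vector_field(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = vector_field(x + dt * k3x, y + dt * k3y)
    x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y

def energy_drift(x0=1.0, y0=0.0, dt=0.01, steps=1000):
    """Largest deviation |H(z(t)) - h| observed along the numerical trajectory."""
    x, y = x0, y0
    h = hamiltonian(x, y)
    drift = 0.0
    for _ in range(steps):
        x, y = rk4_step(x, y, dt)
        drift = max(drift, abs(hamiltonian(x, y) - h))
    return drift

print(energy_drift())  # small; limited by the integrator's accuracy
```

The residual drift comes from the numerical scheme, not from the system itself.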

Let ##\Sigma_h## stand for the set of smooth curves ##\sigma\subset S_h## that connect the manifolds

$$\{x=\tilde x_i\}\subset M,\quad i=1,2.$$

The Reduced Action functional

$$\sigma\xrightarrow{\mathcal G}\int_\sigma\lambda,\quad \lambda=y_idx^i$$ is defined on the set ##\Sigma_h##.

Theorem 2.

1) Assume that ##\hat\sigma\in \Sigma_h## is a critical point of the functional ##\mathcal G## and the curve ##\hat\sigma## does not contain critical points of ##H##. Then there exists a parametrization ##t\mapsto \hat z(t)## of the curve ##\hat\sigma## such that ##\hat z(t)## is a solution to system (1).

2) Assume that a function $$\hat z(t)=(\hat x,\hat y)(t)\in S_h,\quad \hat x(t_i)=\tilde x_i,\quad i=1,2$$ is a solution to system (1). Then ##\hat z(t)## is a parametric representation of some curve ##\hat\sigma\in \Sigma_h## which is a critical point of ##\mathcal G##.



Proof of Theorem 2.



To prove item 1) recall the Vector Field Straightening Theorem:

If at a point ##z'\in M## the Hamiltonian is non-degenerate, ##dH(z')\ne 0##, then in some open neighborhood ##U## of the point ##z'## there exist local symplectic coordinates ##Z=(X,Y)## such that ##H=X^1+const.##

For a proof of this fact, see P.J. Olver: Applications of Lie Groups to Differential Equations. Springer-Verlag, New York, 1989.
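For a concrete instance of the theorem, take ##m=1## and the assumed example ##H=(x^2+y^2)/2##. Away from the origin, the candidate coordinates ##X=(x^2+y^2)/2##, ##Y=\mathrm{atan2}(y,x)## give ##H=X##. The sketch below checks by finite differences that they are symplectic in the convention ##\omega=dY\wedge dX=dy\wedge dx##, that is, that the coefficient of ##dx\wedge dy## in ##dY\wedge dX## equals ##-1##. The function names are hypothetical.

```python
import math

def new_coords(x, y):
    """Candidate straightening coordinates for the assumed H = (x^2 + y^2)/2."""
    X = 0.5 * (x * x + y * y)  # so that H = X
    Y = math.atan2(y, x)       # polar angle; valid away from the branch cut
    return X, Y

def wedge_coefficient(x, y, d=1e-6):
    """Coefficient c in dY ^ dX = c dx ^ dy, estimated by central differences."""
    X_xp, Y_xp = new_coords(x + d, y)
    X_xm, Y_xm = new_coords(x - d, y)
    X_yp, Y_yp = new_coords(x, y + d)
    X_ym, Y_ym = new_coords(x, y - d)
    X_x, Y_x = (X_xp - X_xm) / (2 * d), (Y_xp - Y_xm) / (2 * d)
    X_y, Y_y = (X_yp - X_ym) / (2 * d), (Y_yp - Y_ym) / (2 * d)
    return Y_x * X_y - Y_y * X_x

print(wedge_coefficient(0.8, 0.6))  # close to -1, since dy ^ dx = -dx ^ dy
```

The test point is arbitrary; any point off the negative ##x##-axis and away from the origin would do.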

Observe also that since ##\omega=dY_i\wedge dX^i## and ##d\lambda=\omega##, the form ##\lambda-Y_idX^i## is closed, so (shrinking ##U## if necessary) in these new coordinates one has

$$\lambda=Y_idX^i+df(X,Y).$$

Suppose that the curve ##\hat\sigma## has its ends at points ##A,B## and that we have already parametrized the arc of the curve from ##A## to a point ##C\in \hat\sigma## by ##\hat z(t),\quad \hat z(t_A)=A,\quad \hat z(t_C)=C,\quad t\in[t_A,t_C]##, where ##\hat z(t)## is a solution to system (1).

Introduce the coordinates ##(X,Y)## (straighten the vector field ##v_H##) in some open neighborhood ##U## of the point ##C##.

The manifold ##U\cap S_h## is given by ##\{X^1=h-const\}##, so that ##dX^1## vanishes on it and

$$\lambda\Big|_{U\cap S_h}=\sum_{i=2}^mY_idX^i+df.$$

Let ## \hat Z(t)=(Q(t),P(t)),\quad t\in [t_C,t^*)## be a parametrization of ##\hat\sigma## from the point ##C,\quad \hat Z(t_C)=C## to some other point of ##U##.

Consider the following perturbation $$Z_\varepsilon(t)= \hat Z(t)+\varepsilon w(t)\in U\cap S_h,\quad w(t)=(a,b)(t),\quad \mathrm{supp}\,w\subset (t_C,t^*) ,\quad a^1(t)=0$$ of the curve ##\hat \sigma##. This perturbation is denoted by ##\sigma_\varepsilon##.

Since ##w## has compact support, the term ##df## does not contribute, and after integration by parts we get

$$\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\mathcal G(\sigma_\varepsilon)=\int_{t_C}^{t^*}\sum_{i=2}^m\big(b_i(t)\dot Q^i(t)-a^i(t)\dot P_i(t)\big)dt=0.\qquad(2)$$

Since ##a^i, b_i,\quad i=2,\ldots,m## are arbitrary, this implies that the functions

##P_i(t), Q^i(t),\quad i=2,\ldots,m## are constant.

We also have ## Q^1(t)=h-const ##. Hence only the variable ##Y_1## changes along the curve ##\hat\sigma## in ##U##, and equation (2) imposes no restriction on how it is parametrized. Consequently, if the ##Y_1##-coordinate of the point ##C## is equal to ##Y_1^C##, then we can put ##P_1(t)=Y_1^C+t_C-t##.

It is easy to see that the vector ##\hat Z(t)=(Q(t),P(t))## satisfies system (1) for ##t\in(t_C,t^*)##. The function ##\hat Z(t)## is continuous at the point ##t=t_C##. Moreover,

$$\lim_{t\to t_C+}\frac{d}{dt}\hat Z(t)=v_H(C)=\lim_{t\to t_C-}\frac{d}{dt}\hat z(t).$$
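To illustrate this step, consider again the assumed example ##m=1##, ##H=(x^2+y^2)/2## with straightening coordinates ##X=(x^2+y^2)/2##, ##Y=\mathrm{atan2}(y,x)## (these are choices made for the sketch, not part of the proof). The straightened curve ##X=h##, ##Y(t)=Y^C+t_C-t## maps back to ##x=\sqrt{2h}\cos Y##, ##y=\sqrt{2h}\sin Y##, and one can check numerically that it satisfies ##\dot x=y,\ \dot y=-x##.

```python
import math

def curve(t, h=0.5, Y_C=0.3, t_C=0.0):
    """Image in (x, y) of the straightened curve X = h, Y = Y_C + t_C - t."""
    Y = Y_C + t_C - t
    r = math.sqrt(2 * h)
    return r * math.cos(Y), r * math.sin(Y)

def residual(t, d=1e-6):
    """Residuals of system (1) for H = (x^2 + y^2)/2: |x_dot - y| and |y_dot + x|."""
    x, y = curve(t)
    xp, yp = curve(t + d)
    xm, ym = curve(t - d)
    x_dot = (xp - xm) / (2 * d)
    y_dot = (yp - ym) / (2 * d)
    return abs(x_dot - y), abs(y_dot + x)

print(residual(0.7))  # both entries are close to zero
```

The values of `h`, `Y_C` and `t_C` are arbitrary defaults.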

We then straighten the vector field ##v_H## in a neighborhood of the point ##\hat Z(t^*)## and repeat the argument, and so on.

Let us prove item 2). Let ##\hat z(t)=(\hat x,\hat y)(t)## be a solution to system (1). Then by Theorem 1, ##(t,\hat z(t))## is a parametric representation of some curve ##\hat\gamma\in \Gamma## that is a critical point of ##\mathcal F##. In particular, the curve ##\hat\gamma## is a critical point of ##\mathcal F## relative to perturbations belonging to ##\mathbb{R}\times S_h.##

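Observe also that ##\psi\big|_{\mathbb{R}\times S_h}=\lambda-hdt## and the term ##hdt## is killed by the variation, since it contributes the constant ##h(t_2-t_1)## to ##\mathcal F##. Thus the curve ##\hat \sigma##, defined as the projection of ##\hat \gamma## from ##\mathbb{R}\times S_h## onto ##S_h##, is a critical point of ##\mathcal G##.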

The theorem is proved.
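Theorem 2 can be illustrated numerically on a simple case. The sketch below assumes a free particle in the plane, ##H=|y|^2/2## with ##h=1/2##, so that ##S_h=\{|y|=1\}##, and curves connecting ##x=(0,0)## to ##x=(1,0)##. The base curve has ##x(s)=(s,0)## and ##y=(\cos\varphi_0,\sin\varphi_0)##; it can be parametrized as a solution of system (1) only for ##\varphi_0=0##. The perturbation profiles are hypothetical and keep the curve on ##S_h## exactly.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def reduced_action(eps, phi0):
    """G(sigma_eps) = integral of y . dx along a curve lying on {|y| = 1}."""
    def integrand(s):
        a_dot = math.pi * math.cos(math.pi * s)  # derivative of a(s) = sin(pi s)
        b = 1.0 + s                              # rotation profile of the unit vector y
        phi = phi0 + eps * b
        # x(s) = (s, eps * a(s)), hence dx/ds = (1, eps * a_dot);
        # y(s) = (cos phi, sin phi) has unit length for every eps.
        return math.cos(phi) + math.sin(phi) * eps * a_dot
    return simpson(integrand, 0.0, 1.0)

def first_variation(phi0, step=1e-3):
    """Central difference approximation of dG/d(eps) at eps = 0."""
    return (reduced_action(step, phi0) - reduced_action(-step, phi0)) / (2 * step)

print(first_variation(0.0))  # straight-line motion with y along dx: close to zero
print(first_variation(0.3))  # y tilted away from dx: clearly nonzero
```

Here the endpoints of ##x## are fixed while ##y## is free at the ends, as in the definition of ##\Sigma_h##.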