First, consider the Lagrangian method. Choose two definite positions (q and qo), each at a definite time (t and to). A position at a definite moment in time (e.g., (q,t) or (qo,to)) is called a "spacetime event". Consider the set of all possible trajectories that connect these two fixed spacetime events. The actual physical path of the particle that passes through both events extremizes, or is a "critical point" of, the action integral. Applying the calculus of variations to the possible trajectories passing through the same two fixed events, varying the positions (q) and velocities (dq/dt), then implies the second-order Euler-Lagrange equations, which express Newton's second law of motion (F = ma). These Euler-Lagrange equations assert that the total time derivative of the partial derivative of the Lagrangian with respect to the velocity component, minus the partial derivative of the Lagrangian with respect to the corresponding position component (for a given direction in space), exactly vanishes.
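In symbols, the statement above can be sketched as follows. Here L is the Lagrangian and I the action integral; the component index i and the potential V(q) in the last line are notational assumptions added for illustration.

```latex
I = \int_{t_0}^{t} L(q, \dot{q}, t')\, dt', \qquad
\delta I = 0 \quad \text{with} \quad \delta q(t_0) = \delta q(t) = 0
\;\Longrightarrow\;
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 .

% Example: one particle with L = (1/2) m \dot{q}^2 - V(q)
m \ddot{q}_i = -\frac{\partial V}{\partial q_i} \qquad (F = ma).
```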
Second, consider the Hamiltonian method. Use the Legendre transformation to express the action integral in terms of the Hamiltonian. Now do the calculus of variations on the positions (q) and their canonically conjugate momenta (p). The latter replace the velocities. The result is twice as many first-order Hamiltonian equations of motion, which again express Newton's second law. The Hamiltonian equations assert, firstly, that the velocity component is the partial derivative of the Hamiltonian with respect to the corresponding momentum component and, secondly, that the time derivative of the momentum component is the negative of the partial derivative of the Hamiltonian with respect to the corresponding position component.
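Written out, and assuming the standard definition of the canonical momentum as the velocity derivative of the Lagrangian L (the index i is again only a labelling convention), the Legendre transformation and the two sets of Hamiltonian equations read:

```latex
p_i = \frac{\partial L}{\partial \dot{q}_i}, \qquad
H(q, p, t) = \sum_i p_i \dot{q}_i - L(q, \dot{q}, t),

\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad
\dot{p}_i = -\frac{\partial H}{\partial q_i}.
```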
Third, consider the Hamilton-Jacobi method. We start in the Hamiltonian picture and define finite "canonical transformations" as new positions (Q) and their canonical momenta (P) that are each functions of both old positions (q) and their canonical momenta (p) in such a way that the Hamiltonian equations retain the same form under the transformation.
The generic "generating function" (F) of the finite canonical transformation is then defined. This F is a new kind of field. It does not have objective significance in classical mechanics, but acquires objectivity in Bohm’s version of quantum mechanics. The canonically-transformed action integral differs from the original action integral only by the total time derivative (dF/dt) of this generating function. The generating function is an arbitrary function of both the old (q) and the new (Q) positions and old (p) and new (P) momenta as well as of a single universal time variable (t). However, the equations of the transformation imply that only half of the positions and momenta are independent variables. For example, we can have a generating function that depends only on the old (q) and new (Q) positions plus the time (t). We can have another one that depends only on the old positions (q) and the new momenta (P), and so on. The generating function connects two abstract frames of reference in the dynamical phase space of the classical particle system.
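As a sketch in the usual textbook notation (the subscripts F_1 and F_2 for the two types of generating function are a labelling assumption, not from the text), the requirement that the two action integrals differ only by a total time derivative, and the transformation equations that follow from it, are:

```latex
\sum_i p_i\, dq_i - H\, dt = \sum_i P_i\, dQ_i - K\, dt + dF .

% F = F_1(q, Q, t): old and new positions independent
p_i = \frac{\partial F_1}{\partial q_i}, \qquad
P_i = -\frac{\partial F_1}{\partial Q_i}, \qquad
K = H + \frac{\partial F_1}{\partial t}.

% F = F_2(q, P, t) - \sum_i Q_i P_i: old positions and new momenta independent
p_i = \frac{\partial F_2}{\partial q_i}, \qquad
Q_i = \frac{\partial F_2}{\partial P_i}, \qquad
K = H + \frac{\partial F_2}{\partial t}.
```

Here K is the new Hamiltonian in the (Q, P) variables.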
In particular, the motion of the non-relativistic classical system in time is the continuous unfolding of a particular kind of infinitesimal canonical transformation. This infinitesimal transformation (q -> q + dq, p -> p + dp) over an infinitesimal time dt is generated by the old Hamiltonian (H). The finite-time (t) canonical transformation is determined by the condition that the new Hamiltonian (K) be exactly zero, so that the new positions (Q) and momenta (P) are constant in time along the trajectory. This gives the Hamilton-Jacobi partial differential equation for the generating function (F = S) of this particular finite canonical transformation, which describes the time evolution of the particle in a force field. S is a function of the old and new positions and the time.
The classical mechanical Hamilton-Jacobi equation says that the partial time derivative of (S) plus the old Hamiltonian (H) equals zero. The old momenta (p) inside the old Hamiltonian are partial derivatives of S with respect to the corresponding position components (q).
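In symbols, and taking as an illustrative assumption a single particle of mass m in a potential V, the Hamilton-Jacobi equation described above reads:

```latex
\frac{\partial S}{\partial t} + H\!\left(q, \frac{\partial S}{\partial q}, t\right) = 0,
\qquad p_i = \frac{\partial S}{\partial q_i}.

% One particle with H = p^2 / (2m) + V(q, t)
\frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} + V = 0 .
```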
We have thus constructed a function, Hamilton’s principal function, which generates a canonical transformation to coordinates and momenta that are constant in time along a trajectory. p.33
Holland describes (p.34) a more general variation in which the change in the action path integral (I) is taken with a non-fixed initial position (qo), a non-fixed final position (q) and a non-fixed final time (t), but with a fixed initial time (to). The result is again the Hamilton-Jacobi equation, now obtained from this more general perspective.
The S field function is not unique for a given actual trajectory. There is a kind of gauge freedom here.
... while the aim of Jacobi’s method is the computation of a single orbit, S is actually connected with an infinite set of potential trajectories pursued by an ensemble of identical particles. ... The different functional forms of S connected with the same Hamiltonian imply different types of ensemble. p.36
... in quantum theory two S-functions are distinguished not only globally through the ensembles they generate but in a stronger sense: two particles that start with the same xo, po do not in general pursue the same trajectory in the given potential V. p.37
The problem of [classical] dynamics ... may be formulated in terms of [the Hamilton-Jacobi equation] determining the evolution of a field S(q,t). This function determines at each point and at each instant the momentum of a system that may be potentially placed there through the relation [momentum (p) equals partial derivative of S with respect to corresponding (q)]. For one body the basic law of motion is [velocity vector = gradient of S divided by the particle mass (m)]. The function S is thus connected with an ensemble of identical systems rather than a single orbit. It is in this way that the S-functions may be physically distinguished. For fixed qo, po all S-functions imply the same time development q(t). This reflects the fact that the state of a [classical] material system is completely exhausted by specifying its position and momentum - the S-function plays no role in either defining the state or in determining the dynamics. p.40
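A minimal numerical sketch of this point, assuming a free particle in one dimension (V = 0) and unit mass: the two S-functions below are illustrative choices, not taken from Holland, and all function names are hypothetical. Both satisfy the Hamilton-Jacobi equation and describe different ensembles, yet they assign the same velocity to a particle with the same qo and po.

```python
# Free particle in one dimension: H = p**2 / (2*m), with V = 0 (assumed).
# Hamilton-Jacobi equation: dS/dt + (dS/dq)**2 / (2*m) = 0.

M = 1.0  # particle mass in arbitrary units (illustrative assumption)


def s_point_source(q, t, q0=0.0, m=M):
    """S for an ensemble of particles all released from q0 at t = 0."""
    return m * (q - q0) ** 2 / (2.0 * t)


def s_plane(q, t, p0=2.0, m=M):
    """S for an ensemble in which every particle has the same momentum p0."""
    return p0 * q - p0 ** 2 * t / (2.0 * m)


def partial(f, q, t, wrt, h=1e-5):
    """Central finite-difference partial derivative of f(q, t)."""
    if wrt == "q":
        return (f(q + h, t) - f(q - h, t)) / (2.0 * h)
    return (f(q, t + h) - f(q, t - h)) / (2.0 * h)


def hj_residual(f, q, t, m=M):
    """Left-hand side of the Hamilton-Jacobi equation; ~0 for a solution."""
    return partial(f, q, t, "t") + partial(f, q, t, "q") ** 2 / (2.0 * m)


def velocity(f, q, t, m=M):
    """Velocity field v = (dS/dq) / m implied by the S-function f."""
    return partial(f, q, t, "q") / m


# A particle released at q0 = 0 with p0 = 2 is at q = 3 when t = 1.5.
q, t = 3.0, 1.5
for s in (s_point_source, s_plane):
    print(s.__name__, hj_residual(s, q, t), velocity(s, q, t))
```

The two ensembles differ everywhere else in the (q, t) plane; they agree only along the orbit that both contain, which is the classical behaviour the quotation describes.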
The set of classical orbits moving in a given potential forms a single-valued congruence when represented in phase space, i.e., only one trajectory may pass through each phase space point. It is a common property of classical force fields that when mapped into configuration space the trajectory field is multivalued: at an instant t more than one orbit may pass through a point q. The degree to which this happens depends on the nature of the force and the particular ensemble chosen (i.e., on [the initial] So(q,t)) and is reflected in the value of S(q,t) (which may, for example, include square roots and hence possess different branches). p.40
Note that the configuration space for two point particles has six dimensions for a common universal time. Configuration space is not the same as the familiar physical space of three dimensions with two particles moving in it.
Remember, the dogma of the Copenhagen interpretation emphasized by Bohr is that it is meaningless, if not crackpot, to try to visualize a particle trajectory at the quantum level. However, with today’s computers such trajectories have been simulated in simple cases like a single electron passing through two slits without measuring through which slit the electron passes.
Holland gives a deep discussion of the connection of Hamilton-Jacobi theory to the Liouville equation for the time evolution of the probability distribution in phase space in classical statistical mechanics. This includes the role of classical chaos, which renders deterministic systems effectively unpredictable because tiny changes in initial conditions lead to large differences in trajectories after short times.
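For reference, the Liouville equation in its standard textbook form is given below. The symbol \rho for the phase-space probability density is a notational assumption, and this is not a reproduction of Holland's own presentation.

```latex
\frac{\partial \rho}{\partial t}
+ \sum_i \left(
\frac{\partial \rho}{\partial q_i}\frac{\partial H}{\partial p_i}
- \frac{\partial \rho}{\partial p_i}\frac{\partial H}{\partial q_i}
\right) = 0,
\qquad \rho = \rho(q, p, t).
```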