Classical Mechanics from Zero, in Two Languages
Introduction
Everything you have ever watched move did so in a world you could point to. A thrown stone traces an arc, a spun top holds its axis and then wanders, a door swings on its hinge, a gyroscope resists the hand that tries to tilt it. Each of these is a story about position and about orientation, about where a body is and about which way it faces, told over time. Classical mechanics is the discipline that makes those stories precise, and its first act, long before any force is named, is to fix a stage on which position and orientation can be spoken of at all. That stage is a reference frame: a choice of a point to call the origin and a set of directions against which every other direction is measured. If the frame is inertial, meaning that a body left alone moves in a straight line at constant speed within it, then the frame does something quietly decisive. It lets us stop distinguishing a place from the arrow that reaches it. A point of space and the vector that runs from the origin to it become two names for one thing, and the whole of mechanics can be written in the language of vectors.
The trouble is that there is more than one such language, and the choice between them is not cosmetic. This article develops two of them side by side and keeps them in step from the first definition to the last. The first is the language most readers already carry, the matrix and linear algebra in which a vector is a column of numbers, a rotation is an orthogonal matrix, and an angular velocity is a vector fished out of an antisymmetric matrix by a rule one memorises. The second is geometric algebra, the Clifford-algebra language in which the product of two vectors is a single object that carries both their common projection and their oriented plane, in which a rotation is generated by that plane directly, and in which the pseudo-quantities of the first language become honest elements of the algebra. The two describe the same mechanics. Holding them together is what makes the construction legible, for the matrix language keeps every formula computable, and the geometric language keeps every formula meaningful.
The plan is to build from the ground up. Part I fixes the algebra of a single instant: the inertial arena and the identification of points with their position vectors, then the vector space that houses the arrows, the inner product that measures them, and the linear maps that move them, each stated first in matrix terms. With that in place we press on the matrix language until its seams show, and answer with the geometric product, the graded algebra it generates, and the reflections and rotors that live there. Part II does the same for the calculus, replacing the grad, divergence, and curl of vector analysis, together with their separate integral theorems, by one vector derivative and one fundamental theorem. Part III turns the machinery on physics proper: Newton's laws, angular momentum as an oriented plane, rotating frames, and the rigid body whose orientation is carried by a rotor that never locks. Part IV lifts all of this to the Lagrangian and Hamiltonian formalisms, in both tongues, and closes by tallying what the second language bought.
Three threads run the length of the article, and it is worth naming them now because each returns as a promise kept. The first is a set of discomforts of the matrix language that motivate the whole enterprise: a rotation of the plane has no real eigenvalue, the axis of a spatial rotation is an accident of three dimensions, the cross product is a pseudovector that reflects with the wrong sign and trades an oriented plane for a normal vector only in three dimensions, and the determinant and the very notion of orientation sit outside the algebra of vectors rather than inside it. The second thread is the answer to each, delivered by the geometric product and the objects it makes. The third thread is a single construction, specialised once. We build the geometric algebra of a general finite-dimensional real inner-product space one time and in full, then read off from it the algebra of physical space, the algebra in which the rest of the article works. The facts special to three dimensions, the cross product's identification with a normal vector and the bivector-to-vector Hodge dual chief among them, are flagged as the dimensional accidents they are, so that the general construction and its special case are never confused.
The reader is assumed to have met vectors, partial derivatives, and a little linear algebra: the multiplication of a matrix by a column and the idea of a basis. No prior acquaintance with geometric algebra is presumed, and none with the abstract theory of manifolds, which we deliberately do without: the inertial frame is exactly what lets us stay among honest vectors in a single vector space throughout. Every symbol is defined where it first appears, and every step is shown. We begin with the stage itself.
Part I
1. The two languages
1.1. The inertial arena
Before any vector can be written, the space of positions has to be pinned down. Physical space, as experienced, has no distinguished point and no distinguished directions; a stone falls the same way in Edinburgh and in Quito. The correct model for a space of positions with no preferred origin is an affine space, and the act of choosing an origin is what turns it into the vector space we compute in.
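An affine space $\mathcal{A}$ modelled on a real vector space $V$ is a set of points together with a map that assigns to each ordered pair of points $(P, Q)$ a vector $\overrightarrow{PQ} \in V$, subject to two demands: for any point $P$ and any vector $v \in V$ there is exactly one point $Q$ with $\overrightarrow{PQ} = v$, and for any three points $P, Q, R$ the displacements chain,
\[ \overrightarrow{PQ} + \overrightarrow{QR} = \overrightarrow{PR}. \tag{eq:affine-chain} \]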
The space carries points and their differences, and nothing else; there is no origin and no way to add two points.
An affine space knows about differences of positions but not about positions themselves, which is faithful to the physics, since only displacements, velocities, and separations are measurable, never an absolute location. What the physicist supplies, by an act of choice, is a reference point.
An inertial frame is a choice of a point $O$, the origin, in an affine space $\mathcal{A}$ whose modelling vector space $V$ carries the inner product of Section 1.3, such that a body subject to no force has its position, measured from $O$, tracing a straight line at constant rate. Relative to $O$, every point $P$ acquires a unique position vector
\[ x_P = \overrightarrow{OP} \tag{eq:position-vector} \]
and the assignment $P \mapsto x_P$ is a bijection from points to vectors. We write $x$ for the position vector of a generic point and from here on name points by their position vectors.
The identification of [eq:position-vector] is the single move that lets the whole article proceed without the machinery of manifolds. A point is now a vector, a trajectory is now a vector-valued function $x(t)$, and the velocity and acceleration it carries are vectors in the same space as the positions, as we make precise at once. The chaining rule of [eq:affine-chain] becomes the statement that displacement from $P$ to $Q$ is the difference $x_Q - x_P$ of their position vectors, which is the only sense in which positions may be subtracted.
A moving body is exactly such a vector-valued function of time, so the arena already lets us say how it moves before $V$ is axiomatised in Section 1.2 and long before the geometric calculus of Part II. All that is needed is the derivative of a single-parameter function, the limit of a difference quotient of vectors, which asks nothing of the spatial vector derivative to come.
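The motion of a body is a trajectory, a vector-valued function $x : \mathbb{R} \to V$ assigning to each instant $t$ its position vector $x(t)$. Its velocity and acceleration are the first and second time derivatives, each the limit of a difference quotient of vectors,
\[ \dot{x}(t) = \lim_{h \to 0} \frac{x(t+h) - x(t)}{h}, \qquad \ddot{x}(t) = \lim_{h \to 0} \frac{\dot{x}(t+h) - \dot{x}(t)}{h}. \]
The velocity is the arrow tangent to the path, both derivatives are vectors in the same space $V$ as the position, and the overdot abbreviates $d/dt$.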
The velocity and acceleration inhabit the same $V$ as the position, so the vector space that houses them is the one Section 1.2 axiomatises, and the length that turns a velocity into a speed is the inner product of Section 1.3. Position, velocity, and acceleration at one instant are naturally read as a single object, the jet of the trajectory there.
The $k$-jet of a trajectory $x$ at time $t$ is the tuple of its value and first $k$ derivatives there,
\[ j^k_t x = \bigl( x(t), \dot{x}(t), \ddot{x}(t), \dots, x^{(k)}(t) \bigr). \]
The 2-jet packages position, velocity, and acceleration as one datum, and it is exactly the data a law of motion will constrain: the second law of Section 3.2 fixes the acceleration entry from the position and closes the 2-jet into a trajectory.
These derivatives are limits, and every later estimate in the article rests on reading them as approximations with a controlled error. The derivative at an instant is the best linear approximation to the trajectory there, a bounded linear map whose size is measured by an operator norm, and that single number controls both the Landau remainder of a first-order expansion and the Lipschitz growth of the displacement. Stating this once fixes the footing for every approximation to follow.
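The derivative $Dx(t)$ is the best linear approximation to the trajectory at $t$, the unique linear map $\mathbb{R} \to V$, $h \mapsto \dot{x}(t)\,h$, for which the increment splits into a linear part and a remainder that dies faster than $h$,
\[ x(t+h) - x(t) = \dot{x}(t)\,h + o(h) \qquad (h \to 0). \]
As a bounded linear map its size is the operator norm $\|Dx(t)\| = \sup_{|h| = 1} |\dot{x}(t)\,h| = |\dot{x}(t)|$, the speed itself. When the velocity is continuous on $[t_0, t_1]$, the increment obeys the mean-value estimate
\[ |x(t_1) - x(t_0)| \le \sup_{t \in [t_0, t_1]} |\dot{x}(t)| \; |t_1 - t_0|, \tag{eq:increment-bound} \]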
so a bound on the speed is a Lipschitz bound on the displacement.
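The two estimates above can be checked numerically. What follows is a minimal sketch, not part of the development itself: the helix used as a trajectory, the sample interval, and the function names `position`, `velocity`, and `difference_quotient` are all assumptions chosen for illustration. It shows the difference quotient closing on the velocity as $h$ shrinks, and the displacement staying below the speed bound times the elapsed time.

```python
import math

def position(t):
    """Hypothetical trajectory: the helix x(t) = (cos t, sin t, t/2)."""
    return (math.cos(t), math.sin(t), 0.5 * t)

def velocity(t):
    """Exact time derivative of the helix above."""
    return (-math.sin(t), math.cos(t), 0.5)

def norm(v):
    """Euclidean length of a tuple of components."""
    return math.sqrt(sum(c * c for c in v))

def difference_quotient(f, t, h):
    """(f(t + h) - f(t)) / h, computed component by component."""
    a, b = f(t), f(t + h)
    return tuple((bi - ai) / h for ai, bi in zip(a, b))

t0, t1 = 0.3, 2.1

# The error of the difference quotient shrinks with h (the o(h) remainder).
errors = []
for h in (1e-1, 1e-2, 1e-3):
    dq = difference_quotient(position, t0, h)
    errors.append(norm(tuple(d - v for d, v in zip(dq, velocity(t0)))))

# Mean-value estimate: |x(t1) - x(t0)| <= sup |x'(t)| * |t1 - t0|.
samples = [t0 + (t1 - t0) * k / 1000 for k in range(1001)]
max_speed = max(norm(velocity(t)) for t in samples)
displacement = norm(tuple(b - a for a, b in zip(position(t0), position(t1))))
bound = max_speed * abs(t1 - t0)

print(errors)
print(displacement, "<=", bound)
```

For this particular helix the speed is constant, so the supremum is attained everywhere and the bound is simply the arc length, which the chord can never exceed.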
The operator norm of [eq:increment-bound] is the derivative's size in $V$, so the estimates that ground the calculations depend, like the velocity and acceleration themselves, on having a length at all. Everything now rests on the vector space $V$: its axioms, fixed in Section 1.2, house the arrows, and the inner product of Section 1.3 gives them the length that turns a velocity into a speed and the operator norm into a number. The full geometric calculus that differentiates fields as well as trajectories, and the single fundamental theorem that governs it, come in Part II. We axiomatise $V$ next.
1.2. The vector space
The position vectors of Section 1.1 live in a set on which two operations are defined, the addition of two vectors and the scaling of a vector by a real number. This section states the axioms those operations obey, fixes the model we compute in, and pins the two structural invariants, dimension and basis, that everything downstream refers to.
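A real vector space is a set $V$ with an addition $V \times V \to V$ and a scalar multiplication $\mathbb{R} \times V \to V$ such that addition is associative and commutative with an identity element $0$ and inverses, and the two operations interlock through
\[ \lambda(u + v) = \lambda u + \lambda v, \qquad (\lambda + \mu)v = \lambda v + \mu v, \qquad (\lambda\mu)v = \lambda(\mu v), \qquad 1\,v = v \]
for all scalars $\lambda, \mu \in \mathbb{R}$ and all vectors $u, v \in V$. The elements of $V$ are vectors and the elements of $\mathbb{R}$ are scalars.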
These axioms say only that vectors may be added and scaled coherently; they say nothing yet of length or angle, which arrive in Section 1.3. A finite set of vectors is linearly independent when the only way to combine them to the zero vector is with all coefficients zero, and a basis is a linearly independent set whose linear combinations exhaust $V$. Every basis of a given $V$ has the same number of elements, and that number is an invariant of the space.
A basis of $V$ is a set $\{e_1, \dots, e_n\}$ that is linearly independent and spans $V$, so that every $v \in V$ has a unique expansion
\[ v = \sum_{i=1}^{n} v^i e_i. \]
The number $n$, common to all bases, is the dimension $\dim V$, and the real numbers $v^i$ are the components of $v$ in that basis. We write the component index as a superscript.
Fixing a basis turns each abstract vector into the column of its components, and this is the concrete model in which the matrix language computes.
Once a basis is chosen, the map sending each vector of $V$ to its component tuple is a bijection onto $\mathbb{R}^n$ that respects addition and scaling. We display the tuple as a column vector
\[ [v] = \begin{pmatrix} v^1 \\ \vdots \\ v^n \end{pmatrix} \tag{eq:column-vector} \]
and identify $V$ with $\mathbb{R}^n$ through it. For the physics of the later parts $n = 3$ throughout, though the algebra of this part is stated for general $n$.
The vector $v$ is one object, while the column of [eq:column-vector] depends on the basis chosen; a different basis gives the same arrow a different column. This distinction, between an object and its basis-dependent representation, is the seam that Section 1.4 exploits for maps and that Section 1.5 presses on until the matrix language complains. What is still missing from $V$ is any notion of length or angle, and we supply it now.
1.3. The inner product
The vector space of Section 1.2 can add and scale arrows but cannot yet say that one is longer than another or that two meet at a right angle. Mechanics needs both, since kinetic energy is built from speed and work from the angle between force and displacement. The object that supplies length and angle at once is a single symmetric, positive bilinear form.
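An inner product on a real vector space $V$ is a map $g : V \times V \to \mathbb{R}$ that is symmetric and bilinear,
\[ g(u, v) = g(v, u), \qquad g(\lambda u + \mu v, w) = \lambda\, g(u, w) + \mu\, g(v, w), \]
and positive-definite, meaning $g(v, v) > 0$ for every $v \neq 0$. We write $u \cdot v$ for $g(u, v)$ and call the pair $(V, g)$ a Euclidean space.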
From the single form both metric notions follow. The norm or length of a vector is $|v| = \sqrt{v \cdot v}$, well defined because positive-definiteness makes the radicand non-negative. The angle $\theta$ between two nonzero vectors is fixed by
\[ \cos\theta = \frac{u \cdot v}{|u|\,|v|}, \]
the fraction lying in $[-1, 1]$ by the Cauchy-Schwarz inequality, and two vectors are orthogonal when $u \cdot v = 0$. In components, writing $g_{ij} = e_i \cdot e_j$ for the matrix of the form in a basis, the inner product is $u \cdot v = g_{ij}\, u^i v^j$, summed over repeated indices.
A basis is at its most convenient when it is orthonormal, so that the matrix becomes the identity and components read off directly as inner products.
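A basis $\{e_1, \dots, e_n\}$ of $V$ is orthonormal when
\[ e_i \cdot e_j = \delta_{ij}, \tag{eq:orthonormal} \]
that is, each basis vector has unit length and distinct basis vectors are orthogonal. In such a basis $v^i = v \cdot e_i$ and $u \cdot v = \sum_i u^i v^i$.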
Every Euclidean space admits an orthonormal basis, and one is manufactured from any basis at all by the Gram-Schmidt process: keep the first vector, and from each later vector subtract its projections onto the directions already fixed, then normalise. We record the process and then run it in full.
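Given any basis $\{a_1, \dots, a_n\}$ of a Euclidean space, define recursively
\[ u_k = a_k - \sum_{j < k} (a_k \cdot e_j)\, e_j, \qquad e_k = \frac{u_k}{|u_k|}, \qquad k = 1, \dots, n. \tag{eq:gram-schmidt} \]
Then $\{e_1, \dots, e_n\}$ is an orthonormal basis, and each partial set $\{e_1, \dots, e_k\}$ spans the same subspace as $\{a_1, \dots, a_k\}$.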
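Work in $\mathbb{R}^3$ with the standard inner product $u \cdot v = u^1 v^1 + u^2 v^2 + u^3 v^3$, and orthonormalise, for illustration, the basis
\[ a_1 = (1, 1, 0), \qquad a_2 = (1, 0, 1), \qquad a_3 = (0, 1, 1), \]
taking one vector at a time through [eq:gram-schmidt].
Start with $u_1 = a_1$, whose squared length is $a_1 \cdot a_1 = 2$, so $e_1 = \tfrac{1}{\sqrt{2}}(1, 1, 0)$.
Next remove from $a_2$ its component along $e_1$. The overlap is $a_2 \cdot e_1 = \tfrac{1}{\sqrt{2}}$, so
\[ u_2 = a_2 - (a_2 \cdot e_1)\, e_1 = (1, 0, 1) - \tfrac{1}{2}(1, 1, 0) = \bigl(\tfrac{1}{2}, -\tfrac{1}{2}, 1\bigr). \]
Its squared length is $\tfrac{1}{4} + \tfrac{1}{4} + 1 = \tfrac{3}{2}$, giving $e_2 = \tfrac{1}{\sqrt{6}}(1, -1, 2)$ after clearing the half.
Finally remove from $a_3$ its components along both. The overlaps are $a_3 \cdot e_1 = \tfrac{1}{\sqrt{2}}$ and $a_3 \cdot e_2 = \tfrac{1}{\sqrt{6}}$, so
\[ u_3 = (0, 1, 1) - \tfrac{1}{2}(1, 1, 0) - \tfrac{1}{6}(1, -1, 2) = \bigl(-\tfrac{2}{3}, \tfrac{2}{3}, \tfrac{2}{3}\bigr), \]
with squared length $\tfrac{4}{3}$, so $e_3 = \tfrac{1}{\sqrt{3}}(-1, 1, 1)$.
To check, we confirm orthonormality of [eq:orthonormal] on the mixed pairs: $e_1 \cdot e_2 = 0$, $e_1 \cdot e_3 = 0$, and $e_2 \cdot e_3 = 0$, while $e_1 \cdot e_1$, $e_2 \cdot e_2$, and $e_3 \cdot e_3$ are each $1$. The three vectors form the orthonormal basis [eq:gram-schmidt] promises.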
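The recursion of [eq:gram-schmidt] translates almost line for line into code. The following is a minimal sketch in plain Python under the standard inner product of $\mathbb{R}^n$; the function names `dot` and `gram_schmidt`, the dependence tolerance, and the sample basis are assumptions of the sketch rather than anything fixed by the text.

```python
import math

def dot(u, v):
    """Standard inner product of two component tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def gram_schmidt(basis):
    """Orthonormalise a list of linearly independent vectors."""
    ortho = []
    for a in basis:
        # Subtract from a its projections onto the directions already fixed.
        u = list(a)
        for e in ortho:
            overlap = dot(a, e)
            u = [ui - overlap * ei for ui, ei in zip(u, e)]
        length = math.sqrt(dot(u, u))
        if length < 1e-12:  # tolerance is an arbitrary choice for this sketch
            raise ValueError("input vectors are linearly dependent")
        ortho.append(tuple(ui / length for ui in u))
    return ortho

# Illustrative basis of R^3 (an assumption of this sketch).
basis = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
e = gram_schmidt(basis)

# Orthonormality check: e_i . e_j should be the Kronecker delta.
gram = [[dot(ei, ej) for ej in e] for ei in e]
print(e)
print(gram)
```

The sketch follows the classical form of the process, projecting the original vector onto each earlier direction. In floating-point work on ill-conditioned bases the modified variant, which projects the running remainder instead, is usually preferred for stability.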
The inner product completes the Euclidean space that the whole article inhabits. It is also, quietly, the seed of the second language, for the geometric product of Section 1.6 is built on the demand that a vector times itself return exactly the squared length of this section. Before that, we give the maps between vector spaces their matrix form.
1.4. Linear maps and their matrix representation
Mechanics is largely the study of how vectors transform: how a rotation carries one orientation to another, how the inertia of a body turns its angular velocity into its angular momentum. The transformations that respect the vector-space structure are the linear maps, and once a basis is fixed each acquires a matrix. This section defines both and records how the matrix changes when the basis does, the fact that Section 1.5 turns into a complaint.
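A linear map between real vector spaces $V$ and $W$ is a function $T : V \to W$ that respects addition and scaling,
\[ T(\lambda u + \mu v) = \lambda\, T(u) + \mu\, T(v). \tag{eq:linear-map} \]
When $W = V$ the map is a linear operator on $V$. A linear map is determined by its values on a basis, since $T(v) = v^i\, T(e_i)$ follows from [eq:linear-map].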
Because a linear map is fixed by what it does to a basis, and each image is itself a vector expandable in the basis, the whole map is captured by an array of numbers.
Let $T$ be a linear operator on $V$ and $\{e_i\}$ a basis. Expanding each image in the basis,
\[ T(e_j) = T^i{}_j\, e_i, \]
defines the matrix $[T] = (T^i{}_j)$ of $T$ in that basis, whose $j$-th column is the component tuple of $T(e_j)$. The action of $T$ on a vector becomes matrix-times-column, $[T(v)] = [T][v]$, and the composite $S \circ T$ of two operators has matrix the product $[S][T]$.
The operator $T$ is one object; its matrix is a shadow cast by the basis, exactly as the column of [eq:column-vector] is the shadow of the vector $v$. Choose another basis and the same operator shows a different matrix, related to the first by conjugation.
Let $\{e_i\}$ and $\{e'_i\}$ be two bases, related by the invertible change-of-basis matrix $P$ whose columns give the new basis vectors in the old, $e'_j = P^i{}_j\, e_i$. Then the matrices of a fixed operator $T$ in the two bases are conjugate,
\[ [T]' = P^{-1}\,[T]\,P. \tag{eq:conjugation} \]
Quantities unchanged by such conjugation, the determinant and the trace among them, are properties of $T$ itself rather than of any basis.
The conjugation law of [eq:conjugation] is the precise statement that a matrix is a representation, and it draws the line between what belongs to the operator and what belongs to the observer's choice of frame. The determinant, the trace, and the eigenvalues survive the change of basis and so mean something physical; the individual entries do not. With linear maps in matrix form, the matrix language is complete, and we can now press on it until it strains.
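A small numerical check makes the distinction between operator and representation concrete. This is a sketch only: the particular matrices `T` and `P` are hypothetical, and the 2-by-2 helpers are written out by hand so that the example needs no library. It conjugates a matrix by a change of basis and confirms that the entries move while the trace and determinant do not.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def trace(A):
    return A[0][0] + A[1][1]

def inverse(A):
    """Inverse of an invertible 2x2 matrix via the adjugate."""
    d = det(A)
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

# Matrix of a hypothetical operator in the old basis.
T = [[2.0, 1.0],
     [0.0, 3.0]]
# Hypothetical change of basis: columns are the new basis vectors in the old.
P = [[1.0, 1.0],
     [1.0, 2.0]]

# Conjugation law: [T]' = P^{-1} [T] P.
T_new = matmul(inverse(P), matmul(T, P))

print(T_new)                      # entries differ from T
print(trace(T), trace(T_new))     # trace agrees
print(det(T), det(T_new))         # determinant agrees
```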
1.5. The discomforts of matrix theory
The matrix language of Sections 1.1 through 1.4 is complete and computable, and it will carry us through the whole of mechanics if we let it. This section is the first thesis anchor of the article: a catalogue of the places where that language, though correct, describes the geometry through a veil. Each discomfort listed here is answered, in the same order, by the geometric product and the objects it makes, and the reader should hold each as a promissory note to be redeemed in Sections 1.6 through 1.11.
A rotation of the plane has no real eigenvalue. An eigenvector of an operator is a direction the operator merely stretches, and it is the most basis-independent thing a matrix offers. Yet the rotation of the Euclidean plane through an angle $\theta$,
\[ R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \]
has characteristic roots $\lambda_\pm = \cos\theta \pm i \sin\theta = e^{\pm i\theta}$, which are not real unless $\theta$ is a multiple of $\pi$. The most elementary rotation there is turns no real direction into a multiple of itself, so its own eigen-analysis leaves the real plane to compute in the complex numbers. The rotation is a real, tangible motion, and the appearance of $i$ is an artefact of asking a matrix for eigenvectors it geometrically declines to have.
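The claim is quick to confirm numerically. The sketch below, whose function name `rotation_eigenvalues` is an assumption of the example, solves the characteristic polynomial $\lambda^2 - 2\cos\theta\,\lambda + 1 = 0$ of a plane rotation (trace $2\cos\theta$, determinant $1$) with the quadratic formula over the complex numbers.

```python
import cmath
import math

def rotation_eigenvalues(theta):
    """Roots of lambda^2 - 2 cos(theta) lambda + 1 for a plane rotation."""
    tr = 2.0 * math.cos(theta)
    # The discriminant tr^2 - 4 = -4 sin^2(theta) is never positive,
    # so the square root is purely imaginary unless sin(theta) = 0.
    disc = cmath.sqrt(tr * tr - 4.0)
    return ((tr + disc) / 2.0, (tr - disc) / 2.0)

theta = math.pi / 3
lam_plus, lam_minus = rotation_eigenvalues(theta)
print(lam_plus, lam_minus)             # complex conjugate pair
print(cmath.exp(1j * theta))           # agrees with lam_plus

# Only at multiples of pi do the roots become real.
half_turn = rotation_eigenvalues(math.pi)
print(half_turn)
```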
The axis of a spatial rotation is an accident of odd dimension. In three dimensions a rotation does fix a real direction, its axis, because a real orthogonal matrix of determinant $+1$ has characteristic polynomial of odd degree and so a real root, which is forced to be $+1$. The eigenvector for that root is the axis. This is comforting until one notices that the argument uses the oddness of the dimension and nothing else. In the plane there is no axis, in four dimensions a rotation can turn every direction at once with no axis anywhere, and the familiar picture of a rotation as a spinning about a line is a coincidence of the number three. The plane in which the turning happens exists in every dimension; the axis perpendicular to it does not.
The cross product is a three-dimensional pseudovector that reflects with the wrong sign. The operation used everywhere in mechanics to make an axis of rotation, a torque, or an angular momentum is the cross product $a \times b$. The vector it returns is a stand-in for an oriented plane, and that stand-in is available in three dimensions alone, because only there does the space of oriented planes have the same dimension, three, as the space of vectors, so that a plane can be traded for the vector normal to it. That trade hides a choice of handedness, and it takes revenge under reflection. A genuine vector, a velocity say, reverses when the world is reflected through the origin; the cross product of two reflected vectors comes back unreversed, $(-a) \times (-b) = a \times b$, because the two sign flips cancel while the handedness buried in the definition is left unreflected. An object that transforms like a vector under rotations but fails to flip under reflection is a pseudovector, and torque and angular velocity are pseudovectors living an awkward double life. The figure makes the failure visible.
The plane swept by $a$ and $b$ reflects honestly; the arrow $a \times b$ that stands in for it flips the wrong way. The oriented plane is the invariant object, and Section 1.6 gives it a name, the bivector, that never needs the disguise.
Orientation and the determinant are bolted on from outside. Whether a basis is right- or left-handed, and whether a linear map preserves or reverses that handedness, is measured by the sign of the determinant of [eq:conjugation]. The determinant is a genuine invariant, yet in the matrix language it arrives as a separate formula, an alternating sum over permutations, imported to report on volume and orientation. It is not the product of anything within the algebra of vectors; there is no operation on vectors whose output is the oriented volume they span. Orientation sits outside the vector calculus, consulted when needed and otherwise set aside.
There is no closed product of vectors. Underlying all of the above is one absence. The real numbers may be added and multiplied, and the product of two numbers is a number. Vectors may be added, and scaled, and fed to an inner product that returns a scalar or to a cross product that in three dimensions returns a pseudovector, but there is no associative multiplication of two vectors that yields an object of the same kind, closing the operation the way multiplication closes on numbers. The inner product forgets all direction and keeps a scalar; the cross product keeps a direction, but only by trading an oriented plane for its normal, a trade available in three dimensions alone. Neither is a product on which an algebra can be built. Every discomfort in this list is a symptom of that single lack, and the remedy is a single new product, defined next.
1.6. The geometric product
The remedy to every discomfort of Section 1.5 is a single associative product on vectors that closes: it multiplies two elements of one algebra and returns an element of the same algebra, it contains the inner product as a part, and it generates the oriented planes that the cross product could only impersonate. We demand only one thing of it, that a vector times itself return its squared length, and everything else in this part is read off from that demand.[1][2]
On the Euclidean space $(V, g)$ of Section 1.3, the geometric product is an associative, bilinear, unital product written by juxtaposition $ab$, subject to the single contraction axiom that every vector squares to its inner-product norm,
\[ v^2 = vv = v \cdot v = |v|^2, \tag{eq:contraction} \]
a scalar, for every $v \in V$. The algebra generated by $V$ under this product is the geometric algebra of $V$, developed abstractly in Section 1.7 as $\mathrm{Cl}(V, g)$.
The product is bilinear and associative like matrix multiplication, but no commutativity is assumed, and that omission is where the geometry hides. Polarise the axiom of [eq:contraction] by applying it to a sum $a + b$ and expanding without ever reordering factors,
\[ (a + b)^2 = a^2 + ab + ba + b^2. \]
The left side is $(a + b) \cdot (a + b) = a \cdot a + 2\, a \cdot b + b \cdot b$ by the axiom and the bilinearity of $g$, while the outer terms on the right are $a^2 = a \cdot a$ and $b^2 = b \cdot b$. Cancelling them leaves the symmetric part of the product pinned to the inner product,
\[ ab + ba = 2\, a \cdot b. \]
The symmetric, or commuting, part of the geometric product is exactly the inner product of Section 1.3. What of the antisymmetric part? Give it its own name.
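For vectors $a, b \in V$, the inner product and the outer product are the symmetric and antisymmetric parts of their geometric product,
\[ a \cdot b = \tfrac{1}{2}(ab + ba), \qquad a \wedge b = \tfrac{1}{2}(ab - ba), \]
so that the geometric product decomposes as
\[ ab = a \cdot b + a \wedge b. \tag{eq:geometric-decomp} \]
The outer product $a \wedge b$ is a new kind of object, a bivector, and it is antisymmetric, $a \wedge b = -\,b \wedge a$, so that $a \wedge a = 0$.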
The decomposition of [eq:geometric-decomp] is the keystone of the whole article. A single product of two vectors carries two pieces of information at once: the scalar $a \cdot b$ that measures how much the vectors align, and the bivector $a \wedge b$ that measures the oriented plane they span, with magnitude $|a|\,|b| \sin\theta$, the area of the parallelogram they subtend, and an orientation, the sense of circulation from $a$ to $b$. The bivector is the invariant that Section 1.5 promised in place of the cross product: an oriented plane element, defined in every dimension, that reflects honestly because it is built from the vectors themselves and hides no handedness. The figure shows both parts moving together as $a$ and $b$ sweep.
In the grade language of Section 1.7 these two pieces are the grade-$0$ and grade-$2$ parts of the product, written $\langle ab \rangle_0$ and $\langle ab \rangle_2$, so the decomposition of [eq:geometric-decomp] reads
\[ ab = \langle ab \rangle_0 + \langle ab \rangle_2. \]
This is the grade-$1$-times-grade-$1$ instance of a single associative product defined on multivectors of every grade, which Section 1.7 builds in general; the vector product here is its lowest case.
Two special cases fix the intuition. When $a$ and $b$ are parallel their outer product vanishes and the geometric product is the pure scalar $ab = a \cdot b$; the contraction axiom of [eq:contraction] is the case $b = a$. When they are orthogonal their inner product vanishes and the geometric product is the pure bivector $ab = a \wedge b$, which then anticommutes, $ab = -ba$. Orthogonal vectors anticommute and parallel vectors commute, and the general product of [eq:geometric-decomp] interpolates between the two by the angle. This one product, closed and associative, is the algebra the next section builds out to all grades.
1.7. Grades, blades, and the universal property
The outer product of Section 1.6 takes two vectors to a bivector; nothing stops us wedging a third. This section extends the outer product to all degrees, names the graded family it produces, builds the grade projection and the general geometric product that acts on multivectors of every grade, and then gives the geometric algebra its rigorous definition through a universal property, from which its dimension follows. The construction is done once, in general dimension, and specialised to physical space in Section 1.8, never rebuilt.
The outer product extends to any number of vectors as the fully antisymmetric part of their geometric product. A product of vectors that happen to be mutually orthogonal is already fully antisymmetric, so we take that as the model.
The outer product $a_1 \wedge a_2 \wedge \cdots \wedge a_k$ of $k$ vectors is the totally antisymmetric part of their geometric product; it vanishes when the vectors are linearly dependent and otherwise represents the oriented $k$-dimensional volume element they span. Such an object is a blade of grade $k$, or a $k$-blade. Grade $0$ is the scalars, grade $1$ the vectors, grade $2$ the bivectors, grade $3$ the trivectors, and so on. A multivector is the general element of the algebra, a sum of blades of various grades. When every term of such a sum shares one grade $k$ the multivector is homogeneous of grade $k$, a $k$-vector; a $k$-blade is the special $k$-vector that factors as a single outer product of $k$ independent vectors, and a general $k$-vector is a sum of these. The $k$-vectors form a linear subspace of dimension $\binom{n}{k}$, one basis blade for each choice of $k$ of the $n$ basis directions.
Because a multivector splits uniquely into homogeneous pieces, the algebra carries an operator that reads each piece off.
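The grade-$k$ projection $\langle\,\cdot\,\rangle_k$ is the linear operator returning the grade-$k$ part of a multivector $M$: it keeps the grade-$k$ blades in $M$ and annihilates every other grade. Each multivector is the sum of its projections,
\[ M = \sum_{k=0}^{n} \langle M \rangle_k, \]
with $\langle M \rangle_0$ its scalar part, $\langle M \rangle_1$ its vector part, and so on to the top grade $n$. The projections are linear and idempotent, $\langle \langle M \rangle_k \rangle_k = \langle M \rangle_k$, and a multivector is homogeneous of grade $k$ exactly when $M = \langle M \rangle_k$.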
The blades of all grades from an orthonormal basis, taken as ordered products of distinct basis vectors, are linearly independent and span everything the geometric product can build. Choosing $k$ of the $n$ basis vectors gives a grade-$k$ basis blade, and there are $\binom{n}{k}$ such choices, so the whole algebra has dimension
\[ \sum_{k=0}^{n} \binom{n}{k} = 2^n. \tag{eq:dim-2n} \]
The alternating part of the algebra, the span of the outer products alone, is the exterior algebra $\Lambda V$ of the same dimension $2^n$; the geometric algebra contains it as a vector space and adds the metric information through the inner part of [eq:geometric-decomp].
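The count can be made tangible by listing the basis blades outright. In this sketch, which is an illustration and not part of the construction, a basis blade is represented as the tuple of indices of the generators it multiplies; the function name `basis_blades` is an assumption of the example, and the bivectors come out in lexicographic order.

```python
from itertools import combinations
from math import comb

def basis_blades(n):
    """Basis blades of an n-dimensional geometric algebra as index tuples.

    The empty tuple stands for the scalar 1, (i,) for e_i, (i, j) for e_i e_j
    with i < j, and so on up to the single top-grade blade (1, ..., n).
    """
    return [c for k in range(n + 1) for c in combinations(range(1, n + 1), k)]

blades = basis_blades(3)
grade_counts = [sum(1 for b in blades if len(b) == k) for k in range(4)]

print(blades)         # eight blades for n = 3
print(grade_counts)   # one scalar, three vectors, three bivectors, one trivector

# The total agrees with the binomial sum and with 2^n for small n.
for n in range(7):
    total = len(basis_blades(n))
    print(n, total, sum(comb(n, k) for k in range(n + 1)), 2 ** n)
```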
The geometric product now acts on all of this. It is fixed on vectors by the contraction axiom of [eq:contraction], and because it is associative and bilinear it extends by that same distributive law to the whole graded algebra: any two multivectors are multiplied by expanding each into blades and multiplying the blades. This single associative product on multivectors is the geometric algebra, and the vector product of Section 1.6 is its grade-$1$ case. On homogeneous blades the product resolves cleanly by grade.
The geometric product of a grade-$r$ blade $A_r$ and a grade-$s$ blade $B_s$ spreads over the grades $|r - s|, |r - s| + 2, \dots, r + s$; its lowest grade is the inner product and its highest grade is the outer product,
\[ A_r \cdot B_s = \langle A_r B_s \rangle_{|r - s|}, \qquad A_r \wedge B_s = \langle A_r B_s \rangle_{r + s}. \]
For two vectors, $r = s = 1$, these are the grade-$0$ scalar $a \cdot b$ and the grade-$2$ bivector $a \wedge b$, so the decomposition of [eq:geometric-decomp] is recovered. The outer product raises grade and the inner product lowers it, and the full geometric product carries both at once, which is what closes the algebra.
To make all of this precise, rather than resting on the manipulation of orthonormal blades, we give the defining property that characterises the algebra uniquely.
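The geometric algebra $\mathrm{Cl}(V, g)$ is the associative unital real algebra, equipped with a linear inclusion $\iota : V \to \mathrm{Cl}(V, g)$, that is universal among algebras receiving $V$ with the contraction property: for every associative unital algebra $A$ and every linear map $f : V \to A$ satisfying $f(v)^2 = g(v, v)\, 1_A$ for all $v \in V$, there is a unique algebra homomorphism $F : \mathrm{Cl}(V, g) \to A$ with $F \circ \iota = f$.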
The universal property fixes $\mathrm{Cl}(V, g)$ up to a unique isomorphism, so any construction that has it is the geometric algebra. One such construction proves that the object exists.
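Let $T(V) = \bigoplus_{k \ge 0} V^{\otimes k}$ be the tensor algebra of $V$, and let $\mathcal{I}_g$ be the two-sided ideal generated by all elements $v \otimes v - g(v, v)\, 1$ with $v \in V$. The quotient
\[ \mathrm{Cl}(V, g) = T(V) / \mathcal{I}_g \tag{eq:clifford-quotient} \]
is an associative unital algebra with the universal property of [clifford-universal], and it has dimension $2^n$ where $n = \dim V$, with the grade decomposition of [eq:dim-2n].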
The quotient of [eq:clifford-quotient] is a clean existence proof: the tensor algebra is the free associative algebra on $V$, imposing the relation $v \otimes v = g(v, v)\, 1$ is exactly the contraction axiom of [eq:contraction], and the universal property of the tensor algebra descends to the universal property of the quotient. The geometric algebra is therefore the most general associative algebra in which vectors square to their lengths, which is the sense in which it is the natural home of the metric. With existence and dimension in hand, we commit to the three-dimensional case the rest of the article inhabits.
1.8. The algebra $\mathrm{Cl}(3,0,0)$
We now read the general construction of Section 1.7 off at $n = 3$, giving the algebra $\mathrm{Cl}(3,0,0)$ of physical space, in which the rest of the article computes. The label records the signature: three orthonormal directions each squaring to $+1$, with none squaring to $-1$ or to $0$. Its dimension is $2^3 = 8$, split by grade as one scalar, three vectors, three bivectors, and one trivector.
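Fix an orthonormal basis $\{e_1, e_2, e_3\}$ of $\mathbb{R}^3$, so that $e_i^2 = 1$ and $e_i e_j = -e_j e_i$ for $i \neq j$. The eight basis blades of the algebra, by grade, are
\[ 1; \qquad e_1,\ e_2,\ e_3; \qquad e_{23},\ e_{31},\ e_{12}; \qquad I = e_{123}, \tag{eq:cl3-basis} \]
where $e_{ij} = e_i e_j$ and the bivectors are written in the cyclic order that mirrors the $1 \to 2 \to 3 \to 1$ cycle. The top-grade blade $I = e_1 e_2 e_3$ is the pseudoscalar, the unit oriented volume of space.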
The whole algebra is fixed by how the basis blades multiply, and that in turn follows from $e_i^2 = 1$ and the anticommutation of distinct generators, worked out once and tabulated. The entry in row $A$ and column $B$ is the geometric product $AB$.
| · | 1 | e1 | e2 | e3 | e23 | e31 | e12 | I |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | e1 | e2 | e3 | e23 | e31 | e12 | I |
| e1 | e1 | 1 | e12 | −e31 | I | −e3 | e2 | e23 |
| e2 | e2 | −e12 | 1 | e23 | e3 | I | −e1 | e31 |
| e3 | e3 | e31 | −e23 | 1 | −e2 | e1 | I | e12 |
| e23 | e23 | I | −e3 | e2 | −1 | −e12 | e31 | −e1 |
| e31 | e31 | e3 | I | −e1 | e12 | −1 | −e23 | −e2 |
| e12 | e12 | −e2 | e1 | I | −e31 | e23 | −1 | −e3 |
| I | I | e23 | e31 | e12 | −e1 | −e2 | −e3 | −1 |
Three features of the table carry the geometry. Each bivector squares to $-1$, as the diagonal entries show, so a unit bivector behaves like the imaginary unit, the algebraic root of the plane rotations that Section 1.5 found had no real eigenvalue. The pseudoscalar likewise squares to $-1$.
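In the algebra of physical space the pseudoscalar satisfies
\[ I^2 = -1, \tag{eq:I-squared} \]
and it commutes with every element of the algebra, so the centre of the algebra is the set $\{\alpha + \beta I : \alpha, \beta \in \mathbb{R}\}$, a copy of the complex numbers. Commutativity with vectors is the content of the last row and column of the table, $I e_i = e_i I$, and it is special to odd dimension.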
That the centre is a copy of $\mathbb{C}$ is the deep reason the complex numbers keep appearing in three-dimensional rotation problems; they are already inside the real algebra of space, as the commutative subalgebra spanned by $1$ and $I$. The even subalgebra spanned by $1$ and the three bivectors is a different object, four-dimensional and non-commutative, a copy of the quaternions. We compute one product in full to see the decomposition of [eq:geometric-decomp] at work.
Take the two vectors and and form by expanding bilinearly and reading each basis product from the table, one term at a time,
From the table , , , and , so
The scalar part is , and it should equal , which it does. The bivector part should equal , whose components in the cyclic basis are with and , again matching. The single product of [eq:cl3-result] holds both the alignment and the oriented plane of and , exactly as [eq:geometric-decomp] requires.
With the multiplication table in hand, every later computation in the article is in principle a lookup. The next section uses the pseudoscalar to explain, and then retire, the cross product.
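The claim that every entry follows from $e_i^2 = 1$ and anticommutation can be checked mechanically. The sketch below is one possible encoding, not part of the article's construction: each basis blade is stored as a sign and a bitmask of generators, and the names `BLADES`, `reorder_sign`, and `product` are illustrative assumptions. Because the bitmask keeps generators in ascending order, the cyclic label e31 is stored as minus the blade e1 e3.

```python
# Basis blades of the 3D Euclidean geometric algebra as (sign, bitmask) pairs.
# Bit i of the mask is set when generator e_{i+1} is a factor, in ascending order,
# so mask 0b101 means e1 e3; the cyclic label e31 is therefore -(e1 e3).
BLADES = {
    "1": (1, 0b000), "e1": (1, 0b001), "e2": (1, 0b010), "e3": (1, 0b100),
    "e23": (1, 0b110), "e31": (-1, 0b101), "e12": (1, 0b011), "I": (1, 0b111),
}
LABEL_OF_MASK = {mask: (name, sign) for name, (sign, mask) in BLADES.items()}


def reorder_sign(a: int, b: int) -> int:
    """Sign picked up when the generators of mask b are sorted in among those of mask a."""
    swaps = 0
    a >>= 1
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def product(row: str, col: str) -> str:
    """Geometric product of two named basis blades, returned as a signed table label."""
    (sa, ma), (sb, mb) = BLADES[row], BLADES[col]
    mask = ma ^ mb  # repeated generators square to +1 and drop out
    name, label_sign = LABEL_OF_MASK[mask]
    sign = sa * sb * reorder_sign(ma, mb) * label_sign
    return name if sign > 0 else "-" + name


if __name__ == "__main__":
    order = ["1", "e1", "e2", "e3", "e23", "e31", "e12", "I"]
    for row in order:
        print(row.ljust(4), " ".join(product(row, col).rjust(4) for col in order))
```

Running the script prints an 8 by 8 grid that can be compared entry by entry with the table above, with an ASCII hyphen standing in for the minus sign.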
1.9. Duality and the cross product
The bivector of Section 1.6 is the honest object that the cross product only impersonated. This section makes the impersonation exact: multiplication by the pseudoscalar converts a bivector to the vector normal to it, and that single operation is what the cross product secretly is. Because that conversion needs the space of bivectors and the space of vectors to share a dimension, it works in three dimensions and nowhere else, and it is this identification of the cross product with a vector, rather than the existence of a vector-valued product, that is the dimensional accident.
The dual of a multivector $A$ in $\mathrm{Cl}(3,0,0)$ is its product with the inverse pseudoscalar,
$$A^{*} = A\,I^{-1} = -A\,I$$
the second equality holding because $I^{-1} = -I$ by [eq:I-squared]. Duality exchanges grades $k$ and $3-k$: it sends vectors to bivectors and bivectors to vectors, scalars to pseudoscalars and back. It is the geometric-algebra form of the Hodge star, and since $I$ is central the side of multiplication does not matter.
Duality takes the three basis bivectors of [eq:cl3-basis] to the three basis vectors, and reading the last column of the multiplication table gives the correspondence in the cyclic labelling: $e_{23}^{*} = e_1$, $e_{31}^{*} = e_2$, and $e_{12}^{*} = e_3$. The oriented plane $e_{12}$, for instance, is dual to the vector $e_3$ normal to it, with the right-handed sense. This is precisely the trade that Section 1.5 flagged as buried inside the cross product, now performed in the open.
In $\mathrm{Cl}(3,0,0)$, the cross product of two vectors is the dual of their outer product,
$$a \times b = (a \wedge b)\,I^{-1} = -I\,(a \wedge b)$$ [eq:cross-dual]
The identity holds in three dimensions alone, because it needs the space of bivectors and the space of vectors to have equal dimension so that $I$ can move between them; in the plane there are no vectors dual to the single bivector, and in four dimensions the dual of a bivector is another bivector. The pseudovector character of $a \times b$ noted in Section 1.5 is now explained: $I$ itself changes sign under reflection, so the dual of an honest bivector reflects with the extra sign that makes it a pseudovector.
It is worth being exact about what three dimensions do and do not supply here. A bilinear vector-valued product with the length property $|a \times b|^2 = |a|^2 |b|^2 - (a \cdot b)^2$ exists not in three dimensions alone but also in seven, where it is inherited from the multiplication of the octonions. What is special to three dimensions is the identification of [eq:cross-dual], the reading of that product as the dual of a bivector. In seven dimensions the outer product $a \wedge b$ is a bivector in a twenty-one-dimensional space, whose dual is a five-blade rather than a vector, so the seven-dimensional cross product, though it exists, is not the dual of a plane and falls outside the correspondence developed here. The whole of this article rests on the three-dimensional identification, and it is that identification, not the bare existence of a cross product, that we call a dimensional accident.
The moving figure shows the bivector and its dual normal together: as the plane tips, the vector tracks it, and the callout marks that the tracking is a coincidence of three dimensions.
The lesson for the rest of the article is a matter of policy. Wherever mechanics traditionally writes a cross product, an angular velocity, a torque, an angular momentum, we shall write the bivector that [eq:cross-dual] shows the cross product to be a disguise of. The bivector is defined in every dimension, reflects honestly, and needs no handedness convention, so the pseudovector complaints of Section 1.5 lapse. The dual remains available whenever a normal vector is genuinely wanted, but it is no longer the primary object. We turn next to the operation that made this section's normal directions in the first place, the rotations, and to the elements of the algebra that generate them.
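The identification of the cross product with the dual of the outer product can be checked numerically. The sketch below is an illustration in the matrix language, under one assumption not made in the text: the bivector $a \wedge b$ is stored as the antisymmetric array with entries $a_i b_j - a_j b_i$. The function names are hypothetical.

```python
import numpy as np


def wedge(a, b):
    """Outer product a ^ b as the antisymmetric array W[i, j] = a_i b_j - a_j b_i."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.outer(a, b) - np.outer(b, a)


def dual_vector_3d(W):
    """Dual of a 3D bivector: its e23, e31, e12 components become e1, e2, e3."""
    return np.array([W[1, 2], W[2, 0], W[0, 1]])


def independent_components(n):
    """Number of independent bivector components in n dimensions, n(n-1)/2."""
    return n * (n - 1) // 2


a, b = np.array([1.0, 2.0, 3.0]), np.array([-2.0, 0.5, 4.0])
print(dual_vector_3d(wedge(a, b)), np.cross(a, b))
print([(n, independent_components(n)) for n in (2, 3, 4, 7)])
```

The `wedge` function accepts vectors of any length, while `dual_vector_3d` only makes sense for three components. The component count equals the dimension only at $n = 3$, which is the dimensional coincidence the section describes.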
1.10. Reflections and rotors
Rotations are the motions at the heart of rigid-body mechanics, and Section 1.5 found the matrix picture of them wanting: complex eigenvalues in the plane, an axis that is an accident of dimension. Geometric algebra generates rotations from a more primitive motion, the reflection, and packages them in single algebra elements, the rotors, that compose by multiplication and never lock. This section builds both, works a rotor in full, and records the double cover of the rotation group they realise.
A reflection is expressed by a sandwich of a vector between a unit vector and its inverse.
Let $n$ be a unit vector, $n^2 = 1$, so that $n^{-1} = n$. The map
$$a \mapsto -n\,a\,n$$
is the reflection of $a$ in the plane through the origin with normal $n$, reversing the component of $a$ along $n$ and fixing the plane. In components it equals $a - 2(a \cdot n)\,n$.
To see the identity, split $a = a_\parallel + a_\perp$ into parts along and across $n$. The parallel part commutes with $n$ and the perpendicular part anticommutes with it, by the two special cases of Section 1.6, so $-n\,a\,n = -a_\parallel + a_\perp$, which reverses the normal component alone. Now compose two reflections. Reflecting in the plane normal to $m$ and then in the plane normal to $n$ gives
$$a \mapsto -n\,(-m\,a\,m)\,n = (nm)\,a\,(mn)$$ [eq:two-reflections]
and the product $nm$ of two unit vectors is a new kind of element that effects a rotation. Two reflections make a rotation, and the angle of that rotation is twice the angle between the mirrors.
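The doubling of the angle is easy to confirm in components. The following sketch uses only the component form $a - 2(a \cdot n)\,n$ of a reflection; the mirror normals and the angle between them are arbitrary illustrative choices.

```python
import numpy as np


def reflect(a, n):
    """Reflect a in the plane through the origin with unit normal n: a - 2 (a.n) n."""
    a, n = np.asarray(a, float), np.asarray(n, float)
    return a - 2.0 * np.dot(a, n) * n


def two_reflections(a, m, n):
    """Reflect in the plane normal to m, then in the plane normal to n."""
    return reflect(reflect(a, m), n)


phi = np.pi / 4  # angle between the two mirror normals
m = np.array([1.0, 0.0, 0.0])
n = np.array([np.cos(phi), np.sin(phi), 0.0])
e1 = np.array([1.0, 0.0, 0.0])

image = two_reflections(e1, m, n)
turned = np.arctan2(image[1], image[0])  # angle of the image measured from e1
print(image, turned / phi)
```

With the normals a quarter of a right angle apart, that is $\pi/4$, the composite carries the first basis vector a full right angle round, and leaves the direction perpendicular to both normals untouched.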
A rotor is a product of an even number of unit vectors, equivalently an even-grade element $R = \alpha + B$ with scalar $\alpha$ and a bivector $B$, normalised so that $R\tilde{R} = 1$, where the reverse $\tilde{R}$ reverses the order of all vector factors. A rotor acts on a vector by the sandwich
$$a \mapsto R\,a\,\tilde{R}$$ [eq:rotor-sandwich]
which is a rotation. For a unit bivector $B$ with $B^2 = -1$, the rotor of a rotation through angle $\theta$ in the plane of $B$ is the exponential
$$R = e^{-B\theta/2} = \cos\tfrac{\theta}{2} - B\,\sin\tfrac{\theta}{2}$$ [eq:rotor-exp]
the series collapsing to cosine and sine because $B^2 = -1$. Products of rotors are rotors. In the general theory a product of any number of unit vectors is called a versor, which generates an orthogonal transformation by the sandwich; rotors are exactly the even versors and reflections the odd ones. Here the rotors form the group we need.
The exponential of [eq:rotor-exp] makes the bivector $B$ the direct generator of the rotation, in place of the antisymmetric matrix or the axis vector of the matrix language. The plane of turning is $B$ itself, present in every dimension, and the half-angle $\theta/2$ is the fingerprint of the two-reflection construction of [eq:two-reflections]. The figure spins a frame under the sandwich and sets the composition of two rotors beside the gimbal-lock path of Euler angles.
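Rotate the vector $e_1$ through $\theta = \pi/2$ in the $e_{12}$ plane, whose unit bivector is $B = e_{12}$ with $B^2 = -1$. The rotor of [eq:rotor-exp] is
$$R = e^{-e_{12}\pi/4} = \cos\tfrac{\pi}{4} - e_{12}\sin\tfrac{\pi}{4} = \tfrac{1}{\sqrt{2}}\,(1 - e_{12})$$
Apply the sandwich [eq:rotor-sandwich] in two steps. First $R\,e_1 = \tfrac{1}{\sqrt{2}}(e_1 - e_{12}e_1) = \tfrac{1}{\sqrt{2}}(e_1 + e_2)$, using $e_{12}e_1 = -e_2$ from the table. Then
$$R\,e_1\,\tilde{R} = \tfrac{1}{2}\,(e_1 + e_2)(1 + e_{12}) = \tfrac{1}{2}\,(e_1 + e_1 e_{12} + e_2 + e_2 e_{12})$$
From the table $e_1 e_{12} = e_2$ and $e_2 e_{12} = -e_1$, so the bracket is $e_1 + e_2 + e_2 - e_1 = 2e_2$, giving $R\,e_1\,\tilde{R} = e_2$. The vector $e_1$ has turned into $e_2$, a quarter turn in the $e_{12}$ plane, exactly as a rotation through $\pi/2$ should, and the same computation with a general angle $\theta$ sends $e_1$ to $\cos\theta\,e_1 + \sin\theta\,e_2$.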
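The same quarter turn can be reproduced numerically. The sketch below rests on an assumption the article does not make: it represents the three generators by the Pauli matrices, a standard 2 by 2 complex representation in which each generator squares to the identity and distinct generators anticommute. The helper names `rotor`, `reverse`, and `rotate` are illustrative.

```python
import numpy as np

# Pauli matrices: a standard 2x2 complex representation of e1, e2, e3,
# since they square to the identity and anticommute pairwise.
E = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
ONE = np.eye(2, dtype=complex)


def rotor(i, j, theta):
    """R = cos(theta/2) - e_i e_j sin(theta/2), turning e_i toward e_j by theta."""
    B = E[i] @ E[j]
    return np.cos(theta / 2) * ONE - np.sin(theta / 2) * B


def reverse(R):
    """Reverse of a rotor; in this representation it is the conjugate transpose."""
    return R.conj().T


def rotate(vec, R):
    """Apply the sandwich R a R~ and read the three vector components back off."""
    a = sum(c * e for c, e in zip(vec, E))
    out = R @ a @ reverse(R)
    return np.array([0.5 * np.trace(out @ e).real for e in E])


R = rotor(0, 1, np.pi / 2)
print(np.round(rotate([1, 0, 0], R), 12))
```

The printed components are those of $e_2$ up to rounding. Calling `rotor(0, 1, 2 * np.pi)` returns minus the identity to rounding error, which still rotates every vector to itself, a numerical glimpse of a rotor and its negative giving one rotation.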
One feature of [eq:rotor-exp] deserves the final word of the section. The rotor $R$ and its negative $-R$ give the same rotation, since $(-R)\,a\,(-\tilde{R}) = R\,a\,\tilde{R}$, and running the angle $\theta$ from $0$ to $2\pi$ carries $R$ from $1$ to $-1$ while the frame returns to where it started. The rotors of unit norm therefore form a group, $\mathrm{Spin}(3)$, that wraps twice around the rotation group $\mathrm{SO}(3)$.[2]
The map sending a unit rotor $R$ to the rotation $a \mapsto R\,a\,\tilde{R}$ is a two-to-one group homomorphism from $\mathrm{Spin}(3)$, the group of unit even-grade elements of $\mathrm{Cl}(3,0,0)$, onto the rotation group $\mathrm{SO}(3)$, with $R$ and $-R$ mapping to the same rotation. As a manifold $\mathrm{Spin}(3)$ is the unit sphere in the four-dimensional even subalgebra, so $\mathrm{Spin}(3) \cong S^3$, and it is the reason a spin-half object must turn through $4\pi$ to return to itself.
The reflection figure closes the section by composing two mirror operations into a single rotation, the geometric content of [eq:two-reflections].
Rotors are the objects that Part III integrates to follow a spinning body, free of Euler angles and gimbal lock. Before the physics, one section remains in the algebra: how a linear operator of Section 1.4 acts on the higher-grade blades, which is where the determinant finally becomes a product.
1.11. Operators in geometric algebra
A linear operator of Section 1.4 acts on vectors. Geometric algebra has bivectors, trivectors, and the pseudoscalar as well, and there is one canonical way to extend an operator to all of them at once. The extension is the outermorphism, and it settles the last discomfort of Section 1.5: the determinant, bolted on from outside in the matrix language, becomes the action of the operator on the pseudoscalar, a product within the algebra.
Let $f$ be a linear operator. Its outermorphism is the unique extension of $f$ to all of $\mathrm{Cl}(3,0,0)$ that fixes scalars, agrees with $f$ on vectors, and preserves the outer product,
$$f(A \wedge B) = f(A) \wedge f(B)$$ [eq:outermorphism]
It is grade-preserving, sending each $k$-blade to a $k$-blade, and it respects composition, $(f \circ g)(A) = f(g(A))$.
The outermorphism is the natural way an operator drags oriented areas and volumes along with the vectors, since a blade is an oriented volume element and [eq:outermorphism] just transports each of its spanning vectors. Applied to the top blade, the pseudoscalar $I$ of [eq:cl3-basis], it can only rescale it, because the space of trivectors of $\mathrm{Cl}(3,0,0)$ is one-dimensional. The scale factor is the determinant.
For a linear operator $f$ on $\mathbb{R}^3$, the outermorphism acts on the pseudoscalar by
$$f(I) = \det(f)\,I$$ [eq:det-pseudoscalar]
and this equation may be taken as the definition of $\det f$. It is manifestly basis-independent, it gives the multiplicative law $\det(f \circ g) = \det f\,\det g$ at once from $(f \circ g)(I) = f(g(I))$, and its sign reports whether $f$ preserves or reverses the orientation carried by $I$.
The determinant is now a statement about volume inside the algebra, rather than an alternating sum imported from outside it: $\det f$ is the factor by which $f$ inflates the unit oriented volume $I$. Writing $f(I) = f(e_1) \wedge f(e_2) \wedge f(e_3)$ and expanding by [eq:outermorphism] reproduces the familiar permutation-sum formula, so the two definitions agree, but only [eq:det-pseudoscalar] makes the meaning plain. The figure morphs a grid and a volume cell under an operator and reads the determinant off as the volume scale.
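The agreement between the two definitions can be tested directly. The sketch below takes the images of the basis vectors to be the columns of a matrix, expands their outer product by antisymmetry into the permutation sum, and compares the resulting pseudoscalar coefficient with NumPy's determinant. The function names are illustrative, and the matrices are arbitrary test data.

```python
from itertools import permutations

import numpy as np


def permutation_sign(p):
    """Parity of a permutation of 0..n-1, found by sorting it with transpositions."""
    sign = 1
    p = list(p)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


def top_blade_coefficient(vectors):
    """Coefficient of e1 ^ ... ^ en in the outer product of n vectors in n dimensions."""
    n = len(vectors)
    return sum(
        permutation_sign(p) * np.prod([vectors[k][p[k]] for k in range(n)])
        for p in permutations(range(n))
    )


def det_via_pseudoscalar(M):
    """Scale factor in f(I) = f(e1) ^ f(e2) ^ f(e3), with f(e_k) the k-th column of M."""
    M = np.asarray(M, float)
    return top_blade_coefficient([M[:, k] for k in range(M.shape[1])])


F = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, -1.0], [3.0, 0.0, 1.0]])
G = np.array([[1.0, 0.0, 2.0], [0.0, -1.0, 1.0], [1.0, 1.0, 0.0]])
print(det_via_pseudoscalar(F), np.linalg.det(F))
print(det_via_pseudoscalar(F @ G), det_via_pseudoscalar(F) * det_via_pseudoscalar(G))
```

The second printed pair illustrates the multiplicative law: the coefficient for the composite matches the product of the two separate coefficients.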
The inner product supplies one further operator, the adjoint, which the inertia tensor of Part III will need.
The adjoint $\bar{f}$ of a linear operator $f$ is defined through the inner product of Section 1.3 by
$$a \cdot \bar{f}(b) = f(a) \cdot b$$
In an orthonormal basis its matrix is the transpose, $[\bar{f}] = [f]^{\mathsf{T}}$. An operator is symmetric when $\bar{f} = f$ and orthogonal, a rotation or reflection, when $\bar{f} = f^{-1}$, so that it preserves every inner product.
An operator is diagonalised, when it can be, by finding the blades it merely rescales.
A nonzero blade $A$ of grade $k$ is an eigenblade of $f$ with eigenvalue $\lambda$ when $f(A) = \lambda A$. A grade-one eigenblade is an ordinary eigenvector. A symmetric operator on $\mathbb{R}^3$ has three mutually orthogonal eigenvectors with real eigenvalues, and the product of the three eigenvalues is $\det f$ by [eq:det-pseudoscalar], since $f$ scales the volume spanned by the eigenvectors by each eigenvalue in turn.
This closes the algebra of a single instant. The Euclidean space of Sections 1.1 through 1.4 has grown, through the geometric product, into the algebra with its bivectors, rotors, duality, and outermorphisms, and every discomfort catalogued in Section 1.5 has been met. What remains is to let the instant move: to differentiate the vector-valued functions of time and space that mechanics is made of, which is the work of Part II.
Part II
2. The two calculi
2.1. Vector-valued functions and the Fréchet derivative
Mechanics is written in vector-valued functions. A trajectory is a map from an interval of time to , a force field a map from to , a change of coordinates a map from one copy of to another. To differentiate any of them we need a notion of derivative for a map between vector spaces, and the faithful one is the best linear approximation to the map near a point, a linear map of exactly the kind Section 1.4 gave a matrix. The derivative is not the array of partials; it is the linear map those partials represent.
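A map $f : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at a point $x$ when there is a linear map $Df(x) : \mathbb{R}^n \to \mathbb{R}^m$, its derivative, for which the increment splits into a linear part and a remainder that dies faster than the displacement,
$$f(x + h) = f(x) + Df(x)\,h + o(|h|) \qquad (h \to 0)$$ [eq:frechet]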
The linear map is unique when it exists, and it is the same object whatever bases are later chosen to write it down.
The derivative of [eq:frechet] is a linear map, and by Section 1.4 every linear map acquires a matrix once bases are fixed. That matrix, read off through the entries of [eq:matrix-entries], is the Jacobian.
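In the standard bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, the matrix of the derivative of [eq:frechet] is the Jacobian,
$$J_{ij} = \frac{\partial f_i}{\partial x_j}, \qquad i = 1, \dots, m, \quad j = 1, \dots, n$$ [eq:jacobian]
whose $j$-th column is the partial-derivative vector $\partial f / \partial x_j$ at $x$. When $m = n$ the Jacobian is square and its determinant is defined.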
The derivative is one object; its Jacobian [eq:jacobian] is the basis-dependent shadow, exactly as the column [eq:column-vector] is the shadow of a vector and the matrix [eq:matrix-entries] the shadow of an operator. The linearity of the derivative delivers the chain rule at no cost: the derivative of a composite is the composite of the derivatives, so its Jacobian is the matrix product , the composition law of Section 1.4 transcribed for derivatives. We compute one Jacobian and read its determinant as a volume scale, tying it to the outermorphism of Section 1.11.
The polar map $(r, \theta) \mapsto (x, y) = (r\cos\theta,\ r\sin\theta)$ carries the plane's polar coordinates to Cartesian ones. Its partials fill the columns of [eq:jacobian]: differentiating each component with respect to $r$ and then $\theta$,
$$J = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}$$
Its determinant is
$$\det J = r\cos^2\theta + r\sin^2\theta = r$$
which is the factor by which the map inflates an area element, the outermorphism scale [eq:det-pseudoscalar] of Section 1.11 in dimension two. It is exactly the $r$ that weights the polar area element $r\,dr\,d\theta$, and it vanishes at the origin, where the map folds the whole $\theta$-line onto one point and ceases to be invertible. The determinant of the derivative, computed here, is the local measure that Section 2.4 integrates.
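The polar Jacobian and its determinant can be checked against a finite-difference approximation of the derivative. This is a numerical sketch only; the step size and the sample point are arbitrary choices, and the function names are illustrative.

```python
import numpy as np


def polar_to_cartesian(q):
    """The polar map (r, theta) -> (x, y)."""
    r, theta = q
    return np.array([r * np.cos(theta), r * np.sin(theta)])


def numerical_jacobian(f, q, h=1e-6):
    """Central-difference Jacobian; column j is the partial derivative along input j."""
    q = np.asarray(q, float)
    cols = []
    for j in range(q.size):
        step = np.zeros_like(q)
        step[j] = h
        cols.append((f(q + step) - f(q - step)) / (2 * h))
    return np.column_stack(cols)


def analytic_jacobian(q):
    """Closed-form Jacobian of the polar map, columns ordered (r, theta)."""
    r, theta = q
    return np.array([[np.cos(theta), -r * np.sin(theta)],
                     [np.sin(theta), r * np.cos(theta)]])


q = np.array([2.0, 0.6])
J = numerical_jacobian(polar_to_cartesian, q)
print(J)
print(np.linalg.det(J), q[0])
```

The determinant of the numerical Jacobian should agree with the radial coordinate of the sample point to within the finite-difference error.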
2.2. The vector derivative
The Jacobian of Section 2.1 collects every partial derivative into an array. Geometric calculus assembles the same partials into a single object of the algebra, a vector whose components are the derivative operators $\partial_i = \partial / \partial x^i$. Because it is a vector, it multiplies fields by the geometric product of Section 1.6, and that one product is what unifies the differential operators of vector analysis in Section 2.3.
On $\mathbb{R}^n$ with a basis $\{e_i\}$ and its reciprocal basis $\{e^i\}$ fixed by $e^i \cdot e_j = \delta^i_j$, the vector derivative is the grade-one operator
$$\nabla = \sum_i e^i\,\partial_i, \qquad \partial_i = \frac{\partial}{\partial x^i}$$ [eq:vector-derivative]
It acts on a multivector field $F$ by the geometric product, one basis direction at a time,
$$\nabla F = \sum_i e^i\,\partial_i F$$
In an orthonormal basis the reciprocal basis coincides with the basis, $e^i = e_i$, and $\nabla = \sum_i e_i\,\partial_i$.
The reciprocal basis is what makes [eq:vector-derivative] coordinate-free: under a change of basis the vectors and the covectors transform inversely, so the sum is one and the same operator in every frame, a vector rather than a basis-bound list, in the sense Section 1.2 drew between an object and its representation. Contracting with a fixed direction gives the rate of change along that direction,
the directional derivative, which is the derivative of Section 2.1 evaluated on the vector , since .
The single most important feature of $\nabla$ is its grade. It is a vector, so its geometric product with a field carries the two-part structure of [eq:geometric-decomp]: an inner part that lowers grade and an outer part that raises it. Acting on a scalar field $\phi$ there is no lower grade to reach, so the product is purely the grade-raising part, a vector, the gradient
$$\nabla\phi = \sum_i e^i\,\partial_i\phi$$ [eq:grad-def]
which points along the steepest ascent of $\phi$ and whose magnitude is that slope. Acting on a vector field there is both a lower grade and a higher one to reach, and the two parts are the divergence and the curl of the next section.
2.3. Grad, div, and curl unified
For two vectors the geometric product splits as $ab = a \cdot b + a \wedge b$ by [eq:geometric-decomp], a scalar plus a bivector. Nothing in that decomposition used that $a$ was a constant vector, so it holds verbatim when the first factor is the vector operator $\nabla$ of Section 2.2. The scalar part is the divergence and the bivector part is the curl, and the two together are the whole first-order derivative of a vector field.
For a vector field $F$ on $\mathbb{R}^n$, the geometric product $\nabla F$ decomposes into its scalar and bivector parts,
$$\nabla F = \nabla \cdot F + \nabla \wedge F$$ [eq:nabla-decomp]
the scalar part being the divergence and the bivector part the curl, which in an orthonormal basis read
$$\nabla \cdot F = \sum_i \partial_i F_i, \qquad \nabla \wedge F = \sum_{i<j} (\partial_i F_j - \partial_j F_i)\, e_i \wedge e_j$$ [eq:div-curl]
Together with the gradient [eq:grad-def] on scalar fields, the three operators of vector analysis are grade parts of the single product $\nabla F$.
The curl of [eq:div-curl] is a bivector, the honest oriented-plane object of Section 1.6, and it is defined in every dimension. The familiar curl vector of three-dimensional vector analysis is its dual: applying the duality of Section 1.9 and the relation [eq:cross-dual] gives $\nabla \times F = -I\,(\nabla \wedge F)$, so the classical curl is the bivector seen through the pseudoscalar, and it inherits the pseudovector character flagged in Section 1.5. The bivector is the primary object; the axial vector is a three-dimensional shadow of it. We compute the whole product for a field and read the three quantities off.
Take the vector field and form in the orthonormal basis. Differentiating the field along each axis, one direction at a time,
and multiplying each on the left by its basis vector,
Reading each basis product from the table of Section 1.8, with , , , and , the eight terms collect into a scalar and a bivector,
The scalar part is the divergence , and the bivector part is the curl . Dualising the curl by [eq:cross-dual], with , , and , recovers the classical curl vector . To check, the component predicted by [eq:div-curl] is , matching the coefficient in [eq:nabla-F-result], and the divergence matches likewise. The single product [eq:nabla-F-result] holds the divergence and the whole curl at once.
2.4. The fundamental theorem of geometric calculus
One operator has one integral theorem. It states that integrating the vector derivative of a field over a region equals integrating the field itself over the boundary, once each integral is taken against the oriented measure the algebra supplies. The named theorems of vector calculus, Green's, the divergence theorem, and Stokes', are its grade parts, and the ordinary fundamental theorem of calculus is its one-dimensional case.[1]
Let be a compact oriented -dimensional region with boundary . Let be the directed measure on , the oriented unit -blade tangent to scaled by the ordinary volume element, and the directed measure on , the oriented -blade of its boundary. Then for any multivector field ,
where differentiates the field alone. When is a region of of full dimension , the tangent blade is the constant pseudoscalar of Section 1.8, so .
The content of [eq:ftgc] is that the vector derivative and the boundary operation are adjoint under integration: differentiating inside is the same as restricting to the edge. Every classical integral theorem is [eq:ftgc] read at a particular grade. Take a solid region of and a vector field, so that and the boundary measure is with the outward unit normal. Multiplying [eq:ftgc] on the left by and keeping the scalar part turns into its divergence [eq:div-curl] and into , giving the divergence theorem
Keeping instead the bivector part, on a surface rather than a solid, gives Stokes' theorem for the curl [eq:div-curl], and the whole construction in the plane gives Green's theorem; on an interval, with a scalar field, [eq:ftgc] is the fundamental theorem of calculus . One theorem, projected onto each grade in each dimension, is the whole zoo. We ground it on the simplest planar case.
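Work in the plane $\mathbb{R}^2$ with the radial field $F(x) = x$ and let $M$ be the unit disk. Its divergence is $\nabla \cdot F = 2$, a constant, so the left side of [eq:divergence-theorem] is twice the area of the disk, $2\pi$. On the boundary circle the outward unit normal is the position vector itself, $n = x$, so the flux integrand is $F \cdot n = |x|^2 = 1$ on $\partial M$, and the right side is the circumference, $2\pi$. Both sides agree,
$$\int_M \nabla \cdot F\, dA = 2\pi = \oint_{\partial M} F \cdot n\, ds$$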
The area integral of the divergence equals the boundary flux, as [eq:divergence-theorem] and, above it, [eq:ftgc] require.
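The two sides of the planar check can also be evaluated by quadrature. The sketch below is a numerical illustration under its own assumptions: the divergence is estimated by central differences, the disk is sampled on a polar midpoint grid, and the grid sizes are arbitrary. Swapping in another smooth field for `radial_field` is a simple way to probe the theorem beyond the worked case.

```python
import numpy as np


def radial_field(x, y):
    """The radial field F(x) = x, returned as its two Cartesian components."""
    return x, y


def divergence(F, x, y, h=1e-5):
    """Central-difference estimate of dFx/dx + dFy/dy."""
    fx_plus, _ = F(x + h, y)
    fx_minus, _ = F(x - h, y)
    _, fy_plus = F(x, y + h)
    _, fy_minus = F(x, y - h)
    return (fx_plus - fx_minus) / (2 * h) + (fy_plus - fy_minus) / (2 * h)


def area_integral(F, n_r=200, n_t=200):
    """Integral of div F over the unit disk, midpoint rule with area element r dr dtheta."""
    r = (np.arange(n_r) + 0.5) / n_r
    t = (np.arange(n_t) + 0.5) * 2 * np.pi / n_t
    R, T = np.meshgrid(r, t, indexing="ij")
    div = divergence(F, R * np.cos(T), R * np.sin(T))
    return np.sum(div * R) * (1.0 / n_r) * (2 * np.pi / n_t)


def boundary_flux(F, n_t=200):
    """Integral of F . n around the unit circle, where the outward normal is (cos t, sin t)."""
    t = (np.arange(n_t) + 0.5) * 2 * np.pi / n_t
    nx, ny = np.cos(t), np.sin(t)
    fx, fy = F(nx, ny)
    return np.sum(fx * nx + fy * ny) * (2 * np.pi / n_t)


print(area_integral(radial_field), boundary_flux(radial_field), 2 * np.pi)
```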
2.5. The matrix-calculus zoo versus one theorem
This section is the second thesis anchor of the article. Where Section 1.5 catalogued the discomforts of the matrix language for the algebra of a single instant, here we tally the cost of the classical apparatus for the calculus, and set against it the single operator and single theorem of geometric calculus.
The classical vector calculus meets each geometric task with a bespoke tool. It carries three first-order differential operators with incompatible type signatures: the gradient sends a scalar field to a vector field, the divergence sends a vector field to a scalar field, and the curl sends a vector field to another vector field. The three cannot be composed freely, since the output of one is rarely a legal input to the next, and each is proved and remembered on its own. Worse, the curl vector exists in three dimensions alone, for it is built on the same trade of an oriented plane for its normal that made the axial cross product a dimensional accident in Section 1.9; in the plane there is no curl vector, and in four dimensions there is no vector to carry it, though the bivector curl is defined in every dimension. The integral calculus is as fragmented. The fundamental theorem of calculus governs an interval, Green's theorem a planar region, the divergence theorem a solid, and Stokes' theorem a surface, four statements with four proofs, each welded to its dimension and its grade.
Geometric calculus replaces the whole apparatus with one operator and one theorem. The vector derivative of [eq:vector-derivative] is a single grade-one object; acting by the geometric product it produces the gradient on a scalar field and, on a vector field, the divergence and curl together as the scalar and bivector parts of [eq:nabla-decomp]. The fundamental theorem [eq:ftgc] is a single statement from which Green's, the divergence, and Stokes' theorems fall out as grade projections, as Section 2.4 showed. The correspondence is exact and it is worth tabulating.
| Classical apparatus | Type and range | Geometric calculus |
|---|---|---|
| gradient $\nabla\phi$ | scalar to vector, any $n$ | $\nabla$ on a grade-$0$ field |
| divergence $\nabla \cdot F$ | vector to scalar, any $n$ | scalar part $\nabla \cdot F$ of $\nabla F$ |
| curl $\nabla \times F$ | vector to pseudovector, $n = 3$ only | bivector part $\nabla \wedge F$ of $\nabla F$, any $n$ |
| calculus, Green, Gauss, Stokes | four theorems, each tied to its dimension and grade | one theorem [eq:ftgc] |
The saving is the same in kind as the first anchor. There the geometric product absorbed the inner product, the cross product, and the determinant into one associative multiplication; here the vector derivative absorbs the gradient, divergence, and curl into one operator, and the fundamental theorem absorbs four integral theorems into one identity. In both cases the apparatus that the matrix language spreads across special cases and special dimensions is revealed as grade parts of a single object, defined in every dimension and needing no handedness convention. The figure shows the vector derivative at work on a flowing field, its scalar divergence and bivector curl live as the field deforms, and the boundary flux of [eq:ftgc] accumulating in step.
The two calculi are now level, as the two languages were at the close of Part I. Position, its derivative, and the operator that differentiates fields are all in hand, in both tongues, and every discomfort of the matrix calculus catalogued above has an answer inside the algebra. What remains is to set the machinery in motion, and to let it carry the physics itself. That is the work of Part III, where the position vector of Section 1.1 acquires a mass and a law, and the bivector of Section 1.6 becomes the angular momentum of a spinning body.
Part III
3. Newtonian and rigid-body dynamics
3.1. Kinematics
Physics begins where the algebra and the calculus of the first two parts are set in motion. The trajectory, velocity, and acceleration were already won from the inertial arena in Section 1.1, where the position vector was allowed to depend on time and its first two derivatives were read off as the arrows tangent to the path. Here we take them up in the coordinate space the dynamics works in, recall the one estimate they rest on, and carry them to the momentum that Newton's laws govern.
Recalling Section 1.1, a trajectory $r(t)$ is a smooth vector-valued function of time, and its velocity and acceleration are its first and second time derivatives,
$$v = \dot{r} = \frac{dr}{dt}, \qquad a = \dot{v} = \ddot{r}$$ [eq:velocity-acceleration]
each a vector in the same space as the position, with the overdot abbreviating $d/dt$ throughout.
The one-parameter derivative of [eq:velocity-acceleration] is the same limit the increment bound of [increment-bound] controls, and it is the ordinary one-parameter case of the Fréchet derivative of [eq:frechet]: with a single scalar input the linear map is multiplication by the vector $\dot{r}$, so the derivative and the velocity are one object. Because differentiation acts componentwise on a vector-valued function, the velocity in an orthonormal basis is $v = \sum_i \dot{x}_i\, e_i$, the basis vectors being constant in an inertial frame. That last clause is the whole content of the frame being inertial for kinematics, and Section 3.5 is precisely the study of what the velocity becomes when the basis is allowed to turn.
Two derived quantities record the shape of the trajectory rather than its parametrisation. The speed $|v| = \sqrt{v \cdot v}$ is the length of the velocity, built from the inner product of Section 1.3, and differentiating $|v|^2 = v \cdot v$ gives $\tfrac{d}{dt}|v|^2 = 2\,v \cdot a$, so the acceleration is orthogonal to the velocity exactly when the speed is momentarily stationary. The bivector $v \wedge a$ of Section 1.6 is the oriented plane of the motion, and it vanishes precisely on straight-line paths, where velocity and acceleration are parallel. These two invariants, the scalar $v \cdot a$ and the bivector $v \wedge a$, are the grade parts of the single geometric product $va$, and they carry the curvature and the turning of the path without any appeal to a cross product.
3.2. Newton's laws
Kinematics describes how a body moves; dynamics says what makes it move so. The bridge is the mass, a positive scalar attached to the body, and the law that ties the rate of change of its motion to the force upon it. This section states Newton's three laws in the vector language of Section 1.1, with the momentum as the primary dynamical variable, since it is the momentum and not the velocity that the second law governs and the third law conserves.
The mass $m$ of a body is the scalar measuring its resistance to being accelerated. Its momentum is the vector
$$p = m\,v$$
the mass times the velocity of [eq:velocity-acceleration], pointing along the direction of motion with magnitude $m|v|$.
The three laws of Newton are then a statement about momentum in an inertial frame. The first law asserts that such frames exist: there is a class of frames, the inertial ones of Section 1.1, in which a body subject to no force keeps its momentum constant, tracing the straight line at constant speed that defines the frame. The first law is not a special case of the second but its precondition, fixing the arena in which the second law is to be read.
In an inertial frame, the force $F$ on a body is the vector equal to the time rate of change of its momentum,
$$F = \frac{dp}{dt}$$ [eq:second-law]
which for constant mass reduces to $F = m\,a$ with $a$ the acceleration of [eq:velocity-acceleration]. Force and acceleration are genuine vectors of $\mathbb{R}^3$, adding by the parallelogram rule of Section 1.2.
The equation [eq:second-law] is a vector equation, three scalar equations at once, and it inherits from Section 1.2 the fact that forces superpose: the total force is the vector sum of the individual forces, and the single resultant drives the momentum. The third law completes the account for interacting bodies: the force exerted by a first body on a second is met by an equal and opposite force of the second on the first, $F_{12} = -F_{21}$, so that the total momentum of an isolated pair, and by extension of any isolated system, is conserved. Summing [eq:second-law] over the parts of such a system makes every internal force cancel in pairs, leaving $\dot{P} = F_{\mathrm{ext}}$ for the total momentum $P = \sum_k p_k$, with $F_{\mathrm{ext}}$ the total external force (zero when the system is isolated), the momentum form of the law that governs the whole. With force and momentum in hand as vectors, the next section turns to the quantity that the same construction produces one grade higher, the angular momentum.
3.3. Angular momentum as a bivector
The rotational analogue of momentum is where the matrix language first pays the price catalogued in Section 1.5. Tradition writes the angular momentum as the cross product $r \times p$, a pseudovector that flips with the wrong sign under reflection and can be formed as a vector only in three dimensions. The honest object is the one the cross product was impersonating: the oriented plane swept by the position and the momentum, the bivector of Section 1.6. This section takes the bivector as the definition and shows that the law it obeys is cleaner than the one the pseudovector could state.
The angular momentum of a body about the origin is the bivector
$$L = r \wedge p$$ [eq:angular-momentum]
the oriented plane spanned by the position and the momentum, with magnitude the area of the parallelogram they span. The torque of a force about the origin is likewise the bivector
$$\tau = r \wedge F$$ [eq:torque]
the oriented plane of the position and the force.
Both quantities are grade-two elements of $\mathrm{Cl}(3,0,0)$, defined by the outer product alone, so they carry no hidden handedness and reflect as honestly as the plane they name. The pseudovectors $r \times p$ and $r \times F$ of the matrix account are their duals, $r \times p = -I\,L$ and so on through [eq:cross-dual], the normal vectors read off the planes by the pseudoscalar of Section 1.9; the plane is primary and the axial vector its three-dimensional shadow. The law relating the two is the exact rotational counterpart of the second law [eq:second-law], and it follows from that law in one line.
For a body moving under a total force $F$ by the second law [eq:second-law], the angular momentum bivector [eq:angular-momentum] and the torque bivector [eq:torque] satisfy
$$\frac{dL}{dt} = \tau$$ [eq:dL-dt-tau]
The derivation is a differentiation of the outer product, and it turns on the antisymmetry of the wedge. Differentiating [eq:angular-momentum] by the product rule, which the outer product obeys because it is bilinear,
$$\dot{L} = \dot{r} \wedge p + r \wedge \dot{p}$$
The first term vanishes: $\dot{r} \wedge p = v \wedge m v = 0$, since the outer product of any vector with itself is zero by the antisymmetry of Section 1.6. The second term is $r \wedge F$ by the second law, which is the torque [eq:torque]. So $\dot{L} = \tau$, exactly [eq:dL-dt-tau]. The vanishing of the velocity term is the geometric fact that a body cannot exert torque on itself by moving along its own line of motion, and the wedge records it automatically where the cross-product bookkeeping had to invoke the parallel-vector rule by hand. The figure carries the oriented plate of $L$ alongside the orbiting body.
The immediate corollary is the conservation law that drives the next section. When the torque [eq:torque] vanishes, the angular momentum bivector [eq:angular-momentum] is constant in time, plane and magnitude both fixed. A vanishing torque means the force lies along the position vector, and forces of that kind are the central forces, to which we now turn.
3.4. Central forces and Kepler's areal law
A central force is one directed always along the line joining the body to a fixed centre, its magnitude depending only on the distance. Gravity and the Coulomb force are of this kind, and the whole of the two-body problem lives here. The bivector law [eq:dL-dt-tau] of the previous section makes the central case transparent: the torque vanishes identically, the angular momentum plane is frozen, and Kepler's second law drops out as the statement that the frozen bivector is swept at a constant rate. This section derives that areal law and reads it off a worked orbit.
A central force about the origin is a force of the form
directed along the unit position vector , with scalar magnitude a function of the distance alone. It is attractive where and repulsive where .
Because the force [eq:central-force] is parallel to the position, its torque bivector [eq:torque] is , the outer product of a vector with a multiple of itself. The rotational law [eq:dL-dt-tau] then reads , so the angular momentum is conserved.
Under any central force [eq:central-force], the angular momentum bivector [eq:angular-momentum] is constant,
$$L = x \wedge p = \text{constant}.$$
In particular the motion is confined for all time to the fixed plane of $L$, since $x \wedge L = x \wedge x \wedge p = 0$ forces $x$ to lie in the plane the bivector names.
That the whole orbit lies in one plane, a fact the pseudovector account states as the constancy of an axis, is here the constancy of the plane itself, which is what physically holds. The conserved magnitude carries Kepler's law. Write the bivector as $L = x \wedge p = 2m\,\bigl(\tfrac12\, x \wedge v\bigr)$ and note that $\tfrac12\, x \wedge v$ is the oriented area swept per unit time by the radius vector, since in a time $dt$ the radius sweeps the triangle of oriented area $\tfrac12\, x \wedge (v\,dt)$.
The areal velocity of a trajectory is the oriented area swept per unit time by the position vector,
$$\dot{A} = \tfrac12\, x \wedge v = \frac{L}{2m},$$
a bivector equal to the angular momentum divided by twice the mass.
Kepler's second law is now immediate: since $L$ is constant by [eq:L-const], the areal velocity [eq:areal-velocity] is constant, so the radius vector sweeps equal oriented areas in equal times. The law that Kepler inferred from Tycho Brahe's tables of Mars is the bare statement that the bivector $L$ does not change, plane or magnitude, and it holds for every central force, not gravity alone. We ground it on a circular orbit, where every quantity is elementary and the constancy can be checked outright.
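Let a body of mass $m$ move on the circle of radius $r$ at constant angular rate $\omega$ in the $e_1 e_2$ plane,
$$x(t) = r\,(\cos\omega t\; e_1 + \sin\omega t\; e_2),$$
which is the path a central force supplies when its inward pull matches the required acceleration. Differentiating, the velocity is $v = r\omega\,(-\sin\omega t\; e_1 + \cos\omega t\; e_2)$. Form the areal bivector one product at a time,
$$\tfrac12\, x \wedge v = \tfrac12\, r^2\omega\,(\cos\omega t\; e_1 + \sin\omega t\; e_2) \wedge (-\sin\omega t\; e_1 + \cos\omega t\; e_2).$$
Expanding the wedge and using $e_1 \wedge e_1 = e_2 \wedge e_2 = 0$ and $e_2 \wedge e_1 = -e_1 \wedge e_2$, the cross terms combine through $\cos^2\omega t + \sin^2\omega t = 1$ to give
$$\tfrac12\, x \wedge v = \tfrac12\, r^2\omega\; e_1 \wedge e_2,$$
a bivector independent of $t$, so the areal velocity [eq:areal-velocity] is the constant $\tfrac12\, r^2\omega\; e_1 \wedge e_2$ and the angular momentum is $L = m r^2\omega\; e_1 \wedge e_2$. The time to sweep the whole disc is the period $T = 2\pi/\omega$, and the area swept is $\tfrac12\, r^2\omega\, T = \pi r^2$, the area of the circle, as it must be. The constancy of [eq:circle-result] is Kepler's second law in the one case where it can be read by inspection, and the general orbit differs only in that $r$ and the angular rate vary while their combination in [eq:areal-velocity] stays fixed.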
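The circular-orbit computation can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: the helper names `wedge_e12` and `circular_state` and the numerical values of the radius, rate and mass are all hypothetical, and a plane bivector is represented by its single coefficient on $e_1 \wedge e_2$.

```python
import math

def wedge_e12(a, b):
    """Coefficient on e1^e2 of the outer product of two vectors in the e1 e2 plane."""
    return a[0] * b[1] - a[1] * b[0]

def circular_state(r, omega, t):
    """Position and velocity on a circle of radius r at constant angular rate omega."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (r * c, r * s), (-r * omega * s, r * omega * c)

# Illustrative values only; none of them come from the text.
r, omega, m = 2.0, 0.5, 3.0

# Sample the areal-velocity coefficient (1/2) x ^ v at several unrelated times.
areal = []
for k in range(8):
    x, v = circular_state(r, omega, 0.7 * k)
    areal.append(0.5 * wedge_e12(x, v))

angular_momentum = 2.0 * m * areal[0]   # L = 2 m * (areal velocity)
period = 2.0 * math.pi / omega
swept = areal[0] * period               # area swept over one full turn

print(areal)
print(angular_momentum, swept, math.pi * r * r)
```

Every sampled coefficient should agree with $\tfrac12 r^2\omega$ to rounding, and the area swept in one period should match $\pi r^2$.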
3.5. Rotating frames and the Coriolis bivector
The inertial frame of Section 1.1 was chosen so that the basis vectors stay fixed and the velocity of Section 3.1 is a plain componentwise derivative. Much of mechanics is done instead in a frame that turns, the rotating Earth being the standard example, and there the basis vectors are themselves functions of time. The rotor of Section 1.10 is exactly the instrument for carrying one frame into another, and letting it depend on time produces the angular velocity as a bivector and the Coriolis and centrifugal terms as its consequences.
Let a time-dependent rotor $R(t)$ relate the fixed inertial basis $\{e_k\}$ to the rotating basis $\{f_k(t)\}$ through the sandwich $f_k = R\,e_k\,\tilde{R}$ of [eq:rotor-sandwich]. Because $R\tilde{R} = 1$ at every instant, differentiating that constraint pins the rate of the rotor to a bivector.
For a time-dependent rotor $R(t)$, the angular-velocity bivector in the space frame is
$$\Omega = -2\,\dot{R}\,\tilde{R}.$$
Differentiating $R\tilde{R} = 1$ gives $\dot{R}\tilde{R} + R\dot{\tilde{R}} = 0$, so $\dot{R}\tilde{R}$ equals minus its own reverse and is therefore a pure grade-two element, a bivector, with no scalar part. Its body-frame form is the transported bivector $\Omega_B = \tilde{R}\,\Omega\,R$.
That $\Omega$ is a bivector is the rotational counterpart of the fact that velocity is orthogonal to position on a sphere: the rotor lives on the unit sphere of the even subalgebra of Section 1.10, and its velocity is tangent there, which is precisely the grade-two condition. The single bivector $\Omega$ replaces the antisymmetric matrix and the axial angular-velocity vector of the matrix account at once, and it names the plane of turning directly, in the manner of [eq:rotor-exp].
The bivector $\Omega$ governs how the time derivative seen in the rotating frame differs from the one seen in the fixed frame. Let $x_B$ be the body-frame description of a body's position, so that its space position is $x = R\,x_B\,\tilde{R}$. Differentiating this sandwich with [eq:angular-velocity] and collecting the rotor terms through the contraction of a vector with a bivector, $x \cdot B = \tfrac12(xB - Bx)$, gives the transport rule.
For any body-frame vector $x_B$, the space-frame velocity, expressed in the rotating frame, exceeds the naive rotating-frame rate by a term built from the angular-velocity bivector,
$$\tilde{R}\,\dot{x}\,R = \dot{x}_B + x_B \cdot \Omega_B,$$
where $x_B \cdot \Omega_B$ is the grade-one contraction of the vector with the bivector, the geometric-algebra form of the term $\omega \times x$ written in the matrix language.
Applying the rule [eq:transport-rule] a second time, to the velocity in place of the position, unfolds the acceleration into four terms, since each of the two summands in [eq:transport-rule] is itself transported and the bivector may vary. Newton's second law [eq:second-law], written in the rotating frame, then carries three extra terms beside the true force,
$$m\,\ddot{x}_B = F_B - 2m\,\dot{x}_B \cdot \Omega_B - m\,(x_B \cdot \Omega_B) \cdot \Omega_B - m\,x_B \cdot \dot{\Omega}_B,$$
with $F_B = \tilde{R}\,F\,R$ the true force referred to the rotating frame. The three added terms are the inertial forces of the turning frame, each now a contraction against the angular-velocity bivector rather than a cross product. The Coriolis term acts on a body only while it moves within the frame and turns its path sideways in the plane of $\Omega_B$; the centrifugal term points outward from the axis of the plane and grows with distance from it; and the last term, present only when the rotation rate itself changes, is the azimuthal or Euler force. The figure deflects a body crossing a spinning platform and marks the plane of $\Omega_B$ in which the deflection lies.
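A quick numerical check of the inertial terms: a body under no true force moves in a straight line in the fixed frame, so integrating the rotating-frame equation with only the Coriolis and centrifugal terms and then turning the result back should land on that line. The sketch assumes a constant counterclockwise rate `w` in the $e_1 e_2$ plane, writes the two contractions in their standard component form, and uses a hand-written Runge-Kutta step; the function names and all numbers are illustrative, not taken from the text.

```python
import math

def rk4(f, y, dt):
    """One classical Runge-Kutta step for dy/dt = f(y), with y a list of floats."""
    k1 = f(y)
    k2 = f([a + 0.5 * dt * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * dt * b for a, b in zip(y, k2)])
    k4 = f([a + dt * b for a, b in zip(y, k3)])
    return [a + dt * (p + 2 * q + 2 * r + s) / 6.0
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def rotating_rhs(w):
    """Force-free motion as seen in a frame turning counterclockwise at constant rate w."""
    def f(y):
        px, py, vx, vy = y
        ax = 2.0 * w * vy + w * w * px    # Coriolis plus centrifugal, e1 component
        ay = -2.0 * w * vx + w * w * py   # Coriolis plus centrifugal, e2 component
        return [vx, vy, ax, ay]
    return f

w = 0.8                              # assumed constant rotation rate
x0, v0 = (1.0, 0.0), (0.0, 0.5)      # inertial position and velocity at t = 0
t_end, n = 2.0, 2000
dt = t_end / n

# Rotating-frame initial velocity: inertial velocity minus the frame's own turning.
state = [x0[0], x0[1], v0[0] + w * x0[1], v0[1] - w * x0[0]]
f = rotating_rhs(w)
for _ in range(n):
    state = rk4(f, state, dt)

# Turn the rotating-frame position back into the fixed frame.
c, s = math.cos(w * t_end), math.sin(w * t_end)
back = (c * state[0] - s * state[1], s * state[0] + c * state[1])
expected = (x0[0] + v0[0] * t_end, x0[1] + v0[1] * t_end)
print(back, expected)
```

The agreement of `back` with `expected` is the statement that the two inertial terms are artefacts of the turning frame rather than forces on the body.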
The Coriolis term is what deflects the trade winds and swings the plane of a pendulum, and the bivector makes its geometry plain: the deflection lies in the plane of rotation and is proportional to the speed across that plane, with no axis to invoke and no handedness to remember. The rotor that generated $\Omega$ here as an external frame becomes, in the next section, the very orientation of a rigid body, and its equation of motion is the third thesis anchor of the article.
3.6. Rigid-body rotor kinematics
This section is the third thesis anchor of the article. A rigid body is a collection of points whose mutual distances never change, so its configuration is fixed once the position of one reference point and the orientation of the body are given. The orientation is the whole difficulty, and it is where the matrix language, and worse the language of Euler angles, breaks down. We carry the orientation by a rotor, and its equation of motion is a single first-order bivector law that never locks and composes by multiplication.[3]
Fix a reference configuration of the body and label each of its points by the constant vector $x_0$ it occupies there. At time $t$ the body has turned, and the same material point sits at the space position obtained by the rotor sandwich of [eq:rotor-sandwich],
$$x(t) = R(t)\,x_0\,\tilde{R}(t),$$
so that a single rotor $R(t)$, the same for every point, carries the entire orientation of the body. The rigidity is automatic: the sandwich is a rotation, and rotations preserve every inner product by Section 1.11, so all mutual distances and angles are frozen for free. The motion of the whole body is thus reduced to the motion of one rotor on the group $\mathrm{Spin}(3)$ of Section 1.10.
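The orientation rotor $R(t)$ of a rigid body evolves by the first-order bivector law
$$\dot{R} = -\tfrac12\,\Omega\,R,$$
where $\Omega$ is the space-frame angular-velocity bivector of [eq:angular-velocity]. Equivalently $\dot{R} = -\tfrac12\,R\,\Omega_B$ in terms of the transported body-frame bivector $\Omega_B = \tilde{R}\,\Omega\,R$, and either form preserves the normalisation $R\tilde{R} = 1$ exactly.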
The law [eq:rotor-equation] is the same relation [eq:angular-velocity] that defined the angular velocity, now read as an evolution equation for the rotor once $\Omega$ is supplied by the dynamics of the following sections. That it preserves the normalisation is a one-line check that is worth seeing, since it is what keeps the integrated orientation a genuine rotation for all time. Differentiate $R\tilde{R}$ and use [eq:rotor-equation] together with the fact that the reverse of a bivector is its negative, $\tilde{\Omega} = -\Omega$,
$$\frac{d}{dt}\bigl(R\tilde{R}\bigr) = \dot{R}\tilde{R} + R\dot{\tilde{R}} = -\tfrac12\,\Omega\,R\tilde{R} + \tfrac12\,R\tilde{R}\,\Omega = 0 \quad \text{whenever } R\tilde{R} = 1.$$
The norm is conserved to all orders, so a rotor integrated from [eq:rotor-equation] stays on $\mathrm{Spin}(3)$ and the sandwich [eq:rigid-config] stays an exact rotation, where a matrix integrated from the corresponding equation drifts off the orthogonal group and must be reorthogonalised by hand.
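The norm conservation can be watched in floating point. The sketch below is a minimal illustration under assumptions of its own: an even element is stored as four numbers on the basis $(1, e_2e_3, e_3e_1, e_1e_2)$, the angular-velocity bivector is held constant, the sign convention is $\dot{R} = -\tfrac12\,\Omega R$, and each step multiplies by the exact exponential $e^{-\Omega\,dt/2}$. The names `even_mul`, `exp_half_step` and `norm2` are hypothetical helpers, not an established library API.

```python
import math

def even_mul(p, q):
    """Geometric product of two even elements (a, b1, b2, b3) of the 3D algebra,
    with bivector basis B1 = e2e3, B2 = e3e1, B3 = e1e2 (so B1*B2 = -B3, etc.)."""
    a, b = p[0], p[1:]
    c, d = q[0], q[1:]
    dot = b[0] * d[0] + b[1] * d[1] + b[2] * d[2]
    cross = (b[1] * d[2] - b[2] * d[1],
             b[2] * d[0] - b[0] * d[2],
             b[0] * d[1] - b[1] * d[0])
    return (a * c - dot,
            a * d[0] + c * b[0] - cross[0],
            a * d[1] + c * b[1] - cross[1],
            a * d[2] + c * b[2] - cross[2])

def exp_half_step(w, dt):
    """exp(-Omega*dt/2) for a bivector Omega with coefficients w on (B1, B2, B3)."""
    rate = math.sqrt(w[0] ** 2 + w[1] ** 2 + w[2] ** 2)
    half = 0.5 * rate * dt
    s = -math.sin(half) / rate
    return (math.cos(half), s * w[0], s * w[1], s * w[2])

def norm2(r):
    """R times its reverse, which for an even element is the sum of squares."""
    return sum(c * c for c in r)

w = (0.3, -0.4, 1.2)                     # illustrative constant bivector, |w| = 1.3
rate = math.sqrt(sum(c * c for c in w))
n = 10000
dt = (2.0 * math.pi / rate) / n          # n steps make one full physical turn

R = (1.0, 0.0, 0.0, 0.0)
step = exp_half_step(w, dt)
for _ in range(n):
    R = even_mul(step, R)                # R(t + dt) = exp(-Omega dt / 2) R(t)

print(norm2(R), R)
```

After one full turn the rotor should sit at $-1$ rather than $+1$ while its norm stays at one to rounding: the sign is the double cover, and the sandwich it generates is the identity rotation either way.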
The contrast with the classical apparatus is the substance of the anchor. Orientation in the matrix language is carried by three Euler angles, and the equations relating their rates to the angular velocity contain a division by the sine of the second angle, which vanishes when two of the rotation axes align. There the mapping from angles to orientation degenerates, two of the three freedoms collapse into one, and the equations of motion blow up though the body itself does nothing singular. This is gimbal lock, and it is an artefact of coordinatising the orientation by angles, not a feature of the physics. The rotor law [eq:rotor-equation] has no such defect: it is a single equation on the whole group, with no coordinates on the orientation and so no coordinate singularity anywhere, and the right side is a smooth bivector at every configuration. Composing two orientations is the product of their rotors, associative and closed by Section 1.10, where Euler angles compose by an awkward and non-commutative table and rotation matrices carry nine numbers under six constraints. One rotor, four numbers under one constraint, integrated by one bivector equation that never locks: that is the third promise of the article kept.
With the kinematics of the rotor settled, the remaining task is the dynamics, which needs the relation between the angular velocity that drives [eq:rotor-equation] and the angular momentum that the torque changes. That relation is the inertia operator, and it acts on bivectors.
3.7. The inertia operator on bivectors
The angular velocity of Section 3.6 is a bivector, and so, by Section 3.3, is the angular momentum. The object that relates them is the inertia of the body, and its natural home is therefore not a three-by-three matrix acting on axial vectors but a linear operator on the grade-two space of bivectors, of exactly the kind Section 1.11 extended to blades. This section builds that operator, shows it delivers the angular momentum from the angular velocity, and reads the kinetic energy off it.
Consider a rigid body turning about a fixed point, so that each material point at body position $x$ has velocity $v = x \cdot \Omega$ by the contraction of Section 3.5. Summing the angular momentum [eq:angular-momentum] over the body, with mass elements $m_i$ at positions $x_i$, gives $L = \sum_i m_i\, x_i \wedge (x_i \cdot \Omega)$, which is linear in the bivector $\Omega$. That linear dependence is the inertia operator.
The inertia operator of a rigid body about a fixed point is the linear map on bivectors
$$\mathcal{I}(B) = \sum_i m_i\, x_i \wedge (x_i \cdot B),$$
sending a grade-two element to a grade-two element, with the sum running over the mass elements $m_i$ at body positions $x_i$. For a continuous body the sum is the mass integral $\mathcal{I}(B) = \int x \wedge (x \cdot B)\, dm$.
The operator [eq:inertia-operator] is the geometric-algebra form of the inertia tensor, a symmetric operator in the sense of the adjoint of [eq:adjoint], now carried by the bivector scalar product that pairs two bivectors to a scalar. Its symmetry, $\langle A\,\mathcal{I}(B)\rangle = \langle \mathcal{I}(A)\,B\rangle$, follows from the manifest symmetry of [eq:inertia-operator] under exchange of $A$ and $B$ in the sum, and it guarantees, by the eigenblade proposition of Section 1.11, three mutually orthogonal eigen-bivectors with real positive eigenvalues.
Being symmetric, the inertia operator [eq:inertia-operator] has three orthogonal principal eigen-bivectors $B_1, B_2, B_3$ with positive eigenvalues, the principal moments of inertia $I_1, I_2, I_3$,
$$\mathcal{I}(B_k) = I_k\, B_k, \qquad B_k^2 = -1, \qquad k = 1, 2, 3,$$
each eigen-bivector $B_k$ being the plane orthogonal to a principal axis of the body. In the body frame aligned with these planes the operator is diagonal, and a general angular velocity $\Omega = \sum_k \omega_k B_k$ has angular momentum $L = \sum_k I_k\,\omega_k\, B_k$.
The two dynamical quantities the body needs are now both readable from the operator. The angular momentum is its value on the angular-velocity bivector, and the kinetic energy is half its associated quadratic form.
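For a rigid body turning with angular-velocity bivector $\Omega$, the angular momentum bivector and the kinetic energy are
$$L = \mathcal{I}(\Omega), \qquad T = -\tfrac12\,\langle \Omega\,\mathcal{I}(\Omega)\rangle = \tfrac12 \sum_k I_k\,\omega_k^2,$$
the energy a positive-definite quadratic form in the components $\omega_k$ of $\Omega$ along the principal planes, since each moment $I_k > 0$; the minus sign compensates the negative square of a unit bivector.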
The first equation of [eq:L-inertia] is the summed angular momentum computed above, now named; the second is the summed kinetic energy $\tfrac12 \sum_i m_i\,(x_i \cdot \Omega)^2$, which the symmetry of the operator collects into the single scalar product $-\tfrac12\,\langle \Omega\,\mathcal{I}(\Omega)\rangle$. Both are the honest bivector statements of quantities the matrix language writes with the inertia matrix acting on an axial angular-velocity vector; here no vector is fished out of a plane, and the operator acts within the grade-two space throughout. The figure integrates a free top under the rotor law [eq:rotor-equation] and displays the angular-velocity and angular-momentum bivectors as oriented plates, distinct whenever $\Omega$ is not a principal eigen-bivector.
That $\Omega$ and $L$ point along different planes unless the body turns about a principal axis is the whole source of the rich behaviour of a spinning body. The angular momentum is fixed in space when no torque acts, by Section 3.3, while the angular velocity that the rotor law integrates is a different bivector that must therefore wander. Working out how it wanders is the torque-free Euler equation, the final section of this part.
3.8. Euler's equations without the cross product
The angular momentum of a torque-free body is fixed in space, while its angular velocity is a different bivector by Section 3.7, so the body-frame components of the angular velocity must change even when nothing acts on the body. The equations that govern that change are Euler's equations, and the matrix account writes them with a cross product of the angular velocity and the angular momentum. We derive them instead as a bivector rate law, a commutator, with the cross product of Section 1.9 nowhere in sight, and then read off the free top and its instabilities.
The derivation transports the conservation law into the body frame. In the space frame the angular momentum $L = R\,L_B\,\tilde{R}$ is constant under zero torque by [eq:dL-dt-tau], where $L_B = \mathcal{I}(\Omega_B)$ is the body-frame angular momentum of [eq:L-inertia]. Differentiating with the rotor law [eq:rotor-equation] and stripping the sandwich rotors leaves a first-order equation for the body-frame angular momentum alone. The natural product in it is the commutator of two bivectors.
The commutator product of two bivectors $A$ and $B$ is half their commutator under the geometric product,
$$A \times B = \tfrac12\,(AB - BA),$$
which is again a bivector, a grade-two element of the algebra. It is the closed product on the space of bivectors, and it is the object that carries the rotational dynamics, distinct from the retired vector cross product of Section 1.9 though it shares the notation.
With this product the torque-free law takes one line.
For a rigid body under no torque, the body-frame angular-velocity bivector $\Omega_B$ obeys
$$\mathcal{I}(\dot{\Omega}_B) = \Omega_B \times \mathcal{I}(\Omega_B),$$
the rate of change of the body-frame angular momentum equal to its commutator product with the angular velocity. A nonzero body-frame torque $\tau_B$ adds to the right side.
Written in the principal basis of [eq:principal-axes], with $\Omega_B = \sum_k \omega_k B_k$, the single bivector equation [eq:euler-bivector] resolves into its three grade-two components, and the commutator products of the principal eigen-bivectors reproduce the classical Euler equations exactly,
$$I_1\,\dot{\omega}_1 = (I_2 - I_3)\,\omega_2\omega_3, \qquad I_2\,\dot{\omega}_2 = (I_3 - I_1)\,\omega_3\omega_1, \qquad I_3\,\dot{\omega}_3 = (I_1 - I_2)\,\omega_1\omega_2,$$
with no cross product anywhere in the derivation; the antisymmetric coupling that the matrix account gets from $\omega \times L$ is here the commutator [eq:commutator-product] of the bivectors, which lives in every dimension and needs no dual. Two scalars are constant along any solution of [eq:euler-scalar]: the kinetic energy $T$ of [eq:L-inertia], and the squared magnitude of the space angular momentum $|L|^2 = \sum_k I_k^2\,\omega_k^2$, the first because the commutator product is orthogonal to its arguments and the second because $L$ is fixed in space.[3]
Take a body with two equal principal moments, $I_1 = I_2$, the symmetric top. The third of Euler's equations [eq:euler-scalar] reads $I_3\,\dot{\omega}_3 = 0$, so the spin about the symmetry axis, $\omega_3$, is constant. The remaining two become, on writing the constant $\nu = (I_3 - I_1)\,\omega_3 / I_1$,
$$\dot{\omega}_1 = -\nu\,\omega_2, \qquad \dot{\omega}_2 = \nu\,\omega_1,$$
a plane rotation of the pair $(\omega_1, \omega_2)$ at the constant angular rate $\nu$. The body-frame angular velocity therefore keeps a fixed component along the symmetry axis and a component of fixed length that precesses steadily around it. To check the conserved quantities, $\tfrac{d}{dt}(\omega_1^2 + \omega_2^2) = 2\omega_1\dot{\omega}_1 + 2\omega_2\dot{\omega}_2 = 0$, so both the energy and the momentum magnitude are constant, as the general law requires. The tip of $\Omega_B$ traces a circle about the symmetry axis, the simplest instance of the cone described next.
The trajectory of the angular-velocity bivector is a pair of cones. In the body frame the tip of $\Omega_B$ is confined to the intersection of the energy surface $\sum_k I_k\,\omega_k^2 = 2T$ and the momentum surface $\sum_k I_k^2\,\omega_k^2 = |L|^2$, two ellipsoids, and that intersection is a closed curve called the polhode. Seen in the space frame the same motion sweeps a second curve, the herpolhode, as the body rolls its polhode on the invariable plane perpendicular to the fixed $L$. The figure traces both curves and lets the sign of the middle equation of [eq:euler-scalar] tell the stability story.
Order the moments $I_1 > I_2 > I_3$ and examine rotation nearly about each principal axis in turn, perturbing the two small components while the third stays near its steady value. About the largest axis, set $\omega_1$ steady and $\omega_2, \omega_3$ small; differentiating the second Euler equation of [eq:euler-scalar] and substituting the third gives
$$\ddot{\omega}_2 = \frac{(I_3 - I_1)(I_1 - I_2)}{I_2\, I_3}\,\omega_1^2\,\omega_2,$$
whose coefficient is negative, since $I_3 - I_1 < 0$ and $I_1 - I_2 > 0$, so $\omega_2$ oscillates and the axis is stable. The identical computation about the smallest axis, with $\omega_3$ steady, again gives a negative coefficient and stability. About the middle axis, with $\omega_2$ steady, the two factors are $I_2 - I_3 > 0$ and $I_1 - I_2 > 0$, the coefficient is positive, and the perturbation grows exponentially: rotation about the intermediate axis is unstable. This is the tennis-racket theorem, the flip of a spun book or racket about its middle axis, and it falls out of the sign of a single product in [eq:euler-scalar], the same commutator [eq:commutator-product] that replaced the cross product throughout.
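The stability argument and the two conserved scalars can be checked by integrating the classical component form of Euler's equations directly. The sketch below is an illustration under assumptions: the moments are chosen as $(3, 2, 1)$ so that $I_1 > I_2 > I_3$, the starting perturbation is $10^{-3}$, and the integrator is a hand-written Runge-Kutta step. The names `euler_rhs`, `energy`, `momentum_sq` and `run` are hypothetical.

```python
import math

def rk4(f, y, dt):
    """One classical Runge-Kutta step for dy/dt = f(y), with y a list of floats."""
    k1 = f(y)
    k2 = f([a + 0.5 * dt * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * dt * b for a, b in zip(y, k2)])
    k4 = f([a + dt * b for a, b in zip(y, k3)])
    return [a + dt * (p + 2 * q + 2 * r + s) / 6.0
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def euler_rhs(I):
    """Torque-free Euler equations in the principal basis, components (w1, w2, w3)."""
    I1, I2, I3 = I
    def f(w):
        w1, w2, w3 = w
        return [(I2 - I3) * w2 * w3 / I1,
                (I3 - I1) * w3 * w1 / I2,
                (I1 - I2) * w1 * w2 / I3]
    return f

def energy(I, w):
    return 0.5 * sum(i * c * c for i, c in zip(I, w))

def momentum_sq(I, w):
    return sum((i * c) ** 2 for i, c in zip(I, w))

def run(I, w0, axis, t_end=40.0, dt=0.002):
    """Return the largest off-axis magnitude reached and the worst drift of T and |L|^2."""
    f = euler_rhs(I)
    w = list(w0)
    e0, l0 = energy(I, w), momentum_sq(I, w)
    max_off, drift = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        w = rk4(f, w, dt)
        off = math.sqrt(sum(c * c for k, c in enumerate(w) if k != axis))
        max_off = max(max_off, off)
        drift = max(drift, abs(energy(I, w) - e0), abs(momentum_sq(I, w) - l0))
    return max_off, drift

I = (3.0, 2.0, 1.0)   # illustrative moments with I1 > I2 > I3
off_largest, drift_largest = run(I, (1.0, 1e-3, 1e-3), axis=0)
off_middle, drift_middle = run(I, (1e-3, 1.0, 1e-3), axis=1)
print(off_largest, off_middle, drift_largest, drift_middle)
```

About the largest axis the off-axis components should stay at the size of the initial perturbation, while about the middle axis they should grow to order one as the body flips; in both runs the energy and the squared momentum should hold to within the integrator's error.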
This closes the Newtonian and rigid-body dynamics. The position vector of Section 1.1 has acquired a mass and a law, the bivector of Section 1.6 has become the angular momentum and the angular velocity of a spinning body, and the rotor of Section 1.10 has become the orientation itself, integrated by the singularity-free law [eq:rotor-equation] that is the third thesis anchor. What remains is to recast the whole of mechanics in the variational language of Lagrange and Hamilton, in both tongues, which is the work of Part IV.
Part IV
4. Lagrangian and Hamiltonian mechanics
4.1. Configuration space and generalised coordinates
The three parts before this one followed a body through space by watching its position vector, and a rigid body by watching its orientation rotor. Both are instances of one idea: a mechanical system is described at each instant by a point in a space of its possible arrangements, and mechanics is the study of the curve that point traces. This part recasts the dynamics of the earlier parts in that language, first in generalised coordinates and then, in Section 4.3, on the rotor group itself. We stay in coordinates throughout, as the Introduction promised, and treat the arrangement space as a coordinate patch rather than a manifold.
The configuration space of a mechanical system is the set of its geometrically possible arrangements, and a set of generalised coordinates is a tuple
$$q = (q_1, \dots, q_n)$$
of real numbers that names each arrangement uniquely on a region of that set, the integer $n$ being the number of degrees of freedom. A motion of the system is a curve $q(t)$ through the coordinates, and its generalised velocity is the tuple of time derivatives $\dot{q} = (\dot{q}_1, \dots, \dot{q}_n)$.
The generalised coordinates need not be lengths. A single particle in space is named by its three Cartesian components, so $q = (x_1, x_2, x_3)$ and $n = 3$; a plane pendulum of fixed length is named by the one angle its rod makes with the vertical, so $q = \theta$ and $n = 1$; the orientation of a rigid body about a fixed point is named by a rotor and has $n = 3$, the count the double cover of Section 1.10 fixed. What reduces the coordinate count below the ambient one is a constraint, and the constraints that this account admits are the ones that can be written as equations among the coordinates alone.
A holonomic constraint on a system whose arrangements sit in $\mathbb{R}^N$ is a finite set of relations
$$g_a(x_1, \dots, x_N, t) = 0, \qquad a = 1, \dots, k,$$
among the positions and the time alone, each independent of the velocities. The relations cut the $N$ ambient coordinates down to $n = N - k$ independent ones, and any tuple [eq:gen-coords] of that many free parameters solving [eq:holonomic] is a valid set of generalised coordinates.
The pendulum bob moves in the plane, two Cartesian coordinates, subject to the one holonomic relation that its distance from the pivot is fixed, leaving $n = 2 - 1 = 1$ free coordinate, the angle. A bead on a rigid wire, a mass on a rigid rod, a body whose points hold fixed mutual distances: each is a holonomic system, and the rigidity of Section 3.6 is itself the holonomic constraint that froze the distances there. Writing the physics in the free coordinates [eq:gen-coords] builds the constraint in from the start, so no constraint force need ever be named; the coordinates cannot leave the allowed arrangements because they do not parametrise anything else.
One heuristic remark places the velocity of [eq:gen-coords] correctly, without the apparatus of manifolds we are doing without. At each configuration $q$ the generalised velocity $\dot{q}$ ranges over a copy of $\mathbb{R}^n$, the space of rates a curve through $q$ may have, and the pair $(q, \dot{q})$ ranges over the $2n$ numbers on which the Lagrangian of the next section is a function. The collection of all such pairs is what a later theory would call the tangent bundle of the configuration space; here it is simply the domain of coordinates and velocities on which mechanics is written, and we need no more of it than that.
4.2. The action and the Euler-Lagrange equations
With the configuration named by the coordinates of Section 4.1, the law of motion is stated not as a force balance at each instant but as a condition on the whole path at once. A single scalar is assigned to every candidate curve between two fixed configurations, and the curve the system actually follows is the one that makes that scalar stationary.[4] This section defines the scalar, states the principle, and derives from it the differential equations the true path obeys, with each step of the variation shown.
The Lagrangian $L(q, \dot{q}, t)$ of a system is a scalar function of the generalised coordinates [eq:gen-coords], their velocities, and possibly the time. For a system of particles it is the kinetic energy less the potential energy, $L = T - V$. The action of a path $q(t)$ run between times $t_1$ and $t_2$ is the integral of the Lagrangian along it,
$$S[q] = \int_{t_1}^{t_2} L\bigl(q(t), \dot{q}(t), t\bigr)\, dt,$$
a single real number assigned to the whole curve, so that $S$ is a functional on the space of paths with the two endpoints held fixed.
The action of [eq:action] takes a curve and returns a number, and the variational principle selects among curves by asking that number to be stationary. To say what stationary means, compare the true path $q(t)$ with a neighbouring one, $q(t) + \epsilon\,\eta(t)$, where $\eta$ is any smooth deformation that vanishes at the two ends, $\eta(t_1) = \eta(t_2) = 0$, so that both curves join the same fixed endpoints. The action becomes an ordinary function of the single number $\epsilon$, and stationarity is the vanishing of its derivative at $\epsilon = 0$.
A path $q(t)$ with fixed endpoints is a physical motion when the action [eq:action] is stationary on it, meaning that for every deformation $\eta$ vanishing at $t_1$ and $t_2$ the first variation vanishes,
$$\delta S = \frac{d}{d\epsilon}\, S[q + \epsilon\,\eta]\,\Big|_{\epsilon = 0} = 0.$$
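To turn the principle [eq:stationary] into equations, expand the derivative under the integral. Differentiating the action [eq:action] with respect to $\epsilon$ and setting $\epsilon = 0$, the chain rule applied to $L(q + \epsilon\eta, \dot{q} + \epsilon\dot{\eta}, t)$ gives, summing over the coordinate index $i$,
$$\delta S = \int_{t_1}^{t_2} \sum_i \left( \frac{\partial L}{\partial q_i}\,\eta_i + \frac{\partial L}{\partial \dot{q}_i}\,\dot{\eta}_i \right) dt.$$
The second term carries a derivative of the deformation, $\dot{\eta}_i$, where the first carries the deformation itself; to compare them we integrate the second by parts, moving the time derivative off $\eta_i$,
$$\int_{t_1}^{t_2} \frac{\partial L}{\partial \dot{q}_i}\,\dot{\eta}_i\, dt = \left[ \frac{\partial L}{\partial \dot{q}_i}\,\eta_i \right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \frac{d}{dt}\!\left( \frac{\partial L}{\partial \dot{q}_i} \right) \eta_i\, dt.$$
The boundary term of [eq:by-parts] vanishes, and this is exactly where the fixed endpoints earn their place: $\eta(t_1) = \eta(t_2) = 0$ kills it outright. Substituting back into [eq:first-variation] collects the whole first variation onto $\eta_i$ with no derivative left on it,
$$\delta S = \int_{t_1}^{t_2} \sum_i \left( \frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} \right) \eta_i\, dt.$$
The principle asks this to vanish for every admissible $\eta$, and the last step is the lemma that turns an integral condition into a pointwise one.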
If a continuous function $f$ on $[t_1, t_2]$ satisfies $\int_{t_1}^{t_2} f(t)\,\eta(t)\, dt = 0$ for every smooth $\eta$ vanishing at the endpoints, then $f = 0$ identically on the interval.
The lemma is proved by contradiction: were $f$ nonzero at some interior point it would keep one sign on a small interval about it, by continuity, and choosing an $\eta$ that is a positive bump supported there and zero elsewhere would make the integral nonzero, against the hypothesis. Applied to each coefficient in [eq:variation-collected], with the deformations of the several coordinates independent, the lemma forces every bracket to vanish, which is the equation of motion.
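A path $q(t)$ is stationary for the action [eq:action] under fixed endpoints if and only if it satisfies, for each generalised coordinate,
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0, \qquad i = 1, \dots, n.$$
These are the Euler-Lagrange equations, one second-order differential equation per degree of freedom.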
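Stationarity can be seen directly on a discretised action. The sketch below is a rough numerical illustration, not a proof: it assumes a unit-mass, unit-stiffness harmonic oscillator, $L = \tfrac12\dot{x}^2 - \tfrac12 x^2$, whose true path between $x(0) = 0$ and $x(1) = \sin 1$ is $x = \sin t$, and compares it with the straight line through the same endpoints. The midpoint discretisation and the names `action` and `first_variation` are choices made here for illustration.

```python
import math

def action(path, dt, m=1.0, k=1.0):
    """Midpoint-rule action for L = (1/2) m v^2 - (1/2) k x^2 on a sampled path."""
    total = 0.0
    for a, b in zip(path, path[1:]):
        vel = (b - a) / dt
        mid = 0.5 * (a + b)
        total += (0.5 * m * vel * vel - 0.5 * k * mid * mid) * dt
    return total

def first_variation(path, eta, dt, eps=1e-4):
    """Central-difference estimate of dS/d(eps) at eps = 0 along the deformation eta."""
    plus = [x + eps * e for x, e in zip(path, eta)]
    minus = [x - eps * e for x, e in zip(path, eta)]
    return (action(plus, dt) - action(minus, dt)) / (2.0 * eps)

n, t_end = 2000, 1.0
dt = t_end / n
times = [i * dt for i in range(n + 1)]

true_path = [math.sin(t) for t in times]                 # solves x'' = -x
line_path = [math.sin(t_end) * t / t_end for t in times] # same endpoints, not a solution
eta = [math.sin(math.pi * t / t_end) for t in times]     # vanishes at both ends

dS_true = first_variation(true_path, eta, dt)
dS_line = first_variation(line_path, eta, dt)
bumped = [x + 0.1 * e for x, e in zip(true_path, eta)]
gap = action(bumped, dt) - action(true_path, dt)
print(dS_true, dS_line, gap)
```

The first variation on the true path should vanish up to discretisation error, while on the straight line it should not; the positive `gap` reflects that on this short interval the stationary path is in fact a minimum, which the principle itself does not require.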
The equations [eq:euler-lagrange-eq] reproduce Newton at once for a particle in a potential. Take Cartesian coordinates, $L = \tfrac12 m\,\dot{x}^2 - V(x)$; then $\partial L/\partial \dot{x}_i = m\dot{x}_i$, whose time derivative is $m\ddot{x}_i$, while $\partial L/\partial x_i = -\partial V/\partial x_i$, so [eq:euler-lagrange-eq] reads $m\ddot{x}_i = -\partial V/\partial x_i$, the second law [eq:second-law] with the force minus the gradient of the potential of Section 2.2. The gain is that the same [eq:euler-lagrange-eq] holds in any coordinates whatever, since the derivation never chose a frame, and the constrained systems of Section 4.1 are handled in their own free coordinates with no constraint force written down. We run the pendulum in full.
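A bob of mass $m$ swings on a rigid rod of length $\ell$ in a vertical plane under gravity $g$, its single coordinate the angle $\theta$ from the downward vertical. The bob position is $(\ell\sin\theta, -\ell\cos\theta)$, so its speed is $\ell\dot{\theta}$ and its height is $-\ell\cos\theta$, giving the kinetic and potential energies $T = \tfrac12 m\ell^2\dot{\theta}^2$ and $V = -mg\ell\cos\theta$ and hence the Lagrangian
$$L = \tfrac12\, m\ell^2\dot{\theta}^2 + mg\ell\cos\theta.$$
Compute the two pieces of [eq:euler-lagrange-eq] one at a time. The velocity derivative is $\partial L/\partial \dot{\theta} = m\ell^2\dot{\theta}$, whose time derivative is $m\ell^2\ddot{\theta}$. The coordinate derivative is $\partial L/\partial \theta = -mg\ell\sin\theta$. The Euler-Lagrange equation is therefore $m\ell^2\ddot{\theta} + mg\ell\sin\theta = 0$, that is
$$\ddot{\theta} = -\frac{g}{\ell}\,\sin\theta.$$
As a check, for small angles this is $\ddot{\theta} = -(g/\ell)\,\theta$, simple harmonic motion of angular frequency $\sqrt{g/\ell}$ and period $2\pi\sqrt{\ell/g}$, the textbook small-swing result, and the constraint that the rod is rigid never appeared as a force because the single coordinate built it in.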
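The small-swing period can be confirmed by integrating the full pendulum equation. The sketch below assumes $g = 9.81$ and a rod of unit length, releases the bob from rest, and times the first passage through the vertical, which is a quarter of the period; the function name `quarter_period` and the step size are illustrative choices.

```python
import math

def rk4(f, y, dt):
    """One classical Runge-Kutta step for dy/dt = f(y), with y a list of floats."""
    k1 = f(y)
    k2 = f([a + 0.5 * dt * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * dt * b for a, b in zip(y, k2)])
    k4 = f([a + dt * b for a, b in zip(y, k3)])
    return [a + dt * (p + 2 * q + 2 * r + s) / 6.0
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def quarter_period(theta0, g=9.81, length=1.0, dt=1e-4):
    """Time for a pendulum released at rest from theta0 > 0 to first reach the vertical."""
    def f(y):
        theta, theta_dot = y
        return [theta_dot, -(g / length) * math.sin(theta)]
    y, t = [theta0, 0.0], 0.0
    while True:
        y_next = rk4(f, y, dt)
        if y_next[0] <= 0.0:
            # Linear interpolation inside the step that crosses theta = 0.
            frac = y[0] / (y[0] - y_next[0])
            return t + frac * dt
        y, t = y_next, t + dt

g, length = 9.81, 1.0
small_swing = 2.0 * math.pi * math.sqrt(length / g)

ratio_small = 4.0 * quarter_period(0.01, g, length) / small_swing
ratio_large = 4.0 * quarter_period(1.0, g, length) / small_swing
print(ratio_small, ratio_large)
```

For a release angle of a hundredth of a radian the ratio should be one to within a few parts in a million; for a release angle of one radian the period is noticeably longer, which is the nonlinearity of $\sin\theta$ that the small-angle formula drops.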
4.3. Geometric-algebra Lagrangian mechanics
The coordinates of Section 4.1 serve a particle well, but the configuration of a rigid body is a rotor, an element of the algebra rather than a tuple of numbers, and coordinatising it by angles is exactly the move that gimbal-locked in Section 3.6. Geometric algebra lets us vary the rotor directly, as a multivector, and for that we need a derivative with respect to a multivector variable. This section builds it, states the Euler-Lagrange equation for a rotor-valued configuration, and recovers the Euler equations of Section 3.8 from a single Lagrangian on the group $\mathrm{Spin}(3)$.
The scalar product of two multivectors, the scalar part of their geometric product, pairs the algebra with itself, and it is all we need to differentiate a scalar function of a multivector.
Let $F(X)$ be a scalar-valued function of a multivector variable $X$. Its multivector derivative $\partial_X F$ is the multivector defined by its scalar product against every direction $A$,
$$\langle A\,\partial_X F \rangle = \frac{d}{d\epsilon}\, F(X + \epsilon A)\,\Big|_{\epsilon = 0},$$
the right side being the ordinary directional derivative of $F$ along $A$. In a blade basis $\{e_J\}$ with reciprocal blades $\{e^J\}$ it reads $\partial_X = \sum_J e^J\, \partial/\partial X^J$, the geometric-algebra form of the gradient in the coordinates of the algebra, and it is grade-carrying like the vector derivative of Section 2.2 that it generalises.
A Lagrangian $L(R, \dot{R})$ for a rotor-valued configuration is a scalar function of the rotor and its rate, and Hamilton's principle [eq:stationary] applies verbatim, the deformation now a rotor-valued $\delta R(t)$ vanishing at the ends. Repeating the variation of Section 4.2 with the multivector derivative in place of the partial derivatives, and integrating by parts against the scalar product, gives the Euler-Lagrange equation in the same shape as before.
A rotor path $R(t)$ is stationary for the action of a Lagrangian $L(R, \dot{R})$ under fixed endpoints if and only if
$$\frac{d}{dt}\bigl(\partial_{\dot{R}} L\bigr) - \partial_R L = 0,$$
with $\partial_R$ and $\partial_{\dot{R}}$ the multivector derivatives [eq:mv-derivative] with respect to the rotor and its rate. The rotor is constrained to $\mathrm{Spin}(3)$ by $R\tilde{R} = 1$, and the cleanest route to the dynamics reduces [eq:rotor-el-eq] to the body-frame angular velocity before imposing that constraint.
The reduction is the substance of the section, and it recovers the free top. The kinetic energy of a body turning about a fixed point is the scalar $T$ of Section 3.7, built from the body-frame angular-velocity bivector $\Omega_B = -2\,\tilde{R}\dot{R}$ of Section 3.5, and with no potential the Lagrangian is that energy alone,
$$L = -\tfrac12\,\bigl\langle \Omega_B\,\mathcal{I}(\Omega_B) \bigr\rangle.$$
Vary the rotor path and let $\Sigma = -2\,\tilde{R}\,\delta R$ be the bivector that measures the deformation in the body frame, the variational counterpart of $\Omega_B$ as $\delta R$ is of $\dot{R}$. Both $\Omega_B$ and $\Sigma$ are pure bivectors, since the rotor lives on the unit sphere of the even subalgebra of Section 1.10 and its tangent directions are grade two. Differentiating the definitions and using the reverse of a bivector being its negative, the variation of the body angular velocity and the time derivative of the generator are related by
$$\delta\Omega_B = \dot{\Sigma} + \Sigma \times \Omega_B,$$
the commutator product [eq:commutator-product] of the bivectors appearing exactly as it did in the Euler law. This is the one identity the reduction turns on, and it is the statement that the body-frame velocity and its variation do not commute, precisely because the rotor group does not.
The Lagrangian [eq:spin3-lagrangian], made stationary over rotor paths with fixed endpoints, yields the torque-free Euler equation of Section 3.8,
$$\mathcal{I}(\dot{\Omega}_B) = \Omega_B \times \mathcal{I}(\Omega_B),$$
identical to [eq:euler-bivector] and resolving in the principal basis into the scalar equations [eq:euler-scalar].
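The derivation is the variation of Section 4.2 carried through in bivectors. Because the inertia operator is symmetric under the bivector scalar product, the first variation of the action is
$$\delta S = -\int_{t_1}^{t_2} \bigl\langle \delta\Omega_B\,\mathcal{I}(\Omega_B) \bigr\rangle\, dt = -\int_{t_1}^{t_2} \bigl\langle (\dot{\Sigma} + \Sigma \times \Omega_B)\,\mathcal{I}(\Omega_B) \bigr\rangle\, dt,$$
using [eq:delta-omega] for $\delta\Omega_B$. Integrate the $\dot{\Sigma}$ term by parts, the boundary term dying because $\Sigma$ vanishes at the ends as in [eq:by-parts], and rewrite the commutator term with the invariance of the scalar product under the commutator, $\langle (A \times B)\,C \rangle = \langle A\,(B \times C) \rangle$, the bivector form of the cyclic symmetry of the scalar triple product. Both steps move everything onto $\Sigma$,
$$\delta S = \int_{t_1}^{t_2} \Bigl\langle \Sigma\,\Bigl( \tfrac{d}{dt}\,\mathcal{I}(\Omega_B) - \Omega_B \times \mathcal{I}(\Omega_B) \Bigr) \Bigr\rangle\, dt.$$
Stationarity for every bivector deformation $\Sigma$, by the fundamental lemma [fundamental-lemma] read on the bivector-valued integrand, forces the bracket to vanish, and since $\tfrac{d}{dt}\,\mathcal{I}(\Omega_B) = \mathcal{I}(\dot{\Omega}_B)$ for the constant body-frame operator, that is exactly [eq:euler-poincare-eq]. The rigid body of Part III has been recovered from a variational principle, with the orientation carried by a rotor throughout, no Euler angle introduced and no coordinate singularity anywhere, and the same commutator product that replaced the cross product in the equations of motion now emerging from the geometry of the variation itself.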
4.4. Noether's theorem
The conservation laws that Part III proved one at a time, momentum from the absence of a force and the angular momentum plane from the absence of a torque, are in the Lagrangian picture instances of a single theorem. Every continuous symmetry of the action carries a quantity that the motion holds fixed, and the quantity is read off the symmetry by one formula. This section states and proves that correspondence, then reads energy and the angular momentum bivector of Section 3.3 off the symmetries that produce them.
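A continuous symmetry is a family of deformations of the coordinates, indexed by a parameter $s$, that leaves the Lagrangian unchanged. Write the infinitesimal deformation as $\delta q_i = \xi_i(q)$, the generator of the family at $s = 0$.
A vector field $\xi(q)$ on the configuration space generates a continuous symmetry of the Lagrangian when the infinitesimal change $\delta q_i = \xi_i$, together with the induced $\delta \dot{q}_i = \dot{\xi}_i$, leaves the Lagrangian invariant,
$$\sum_i \left( \frac{\partial L}{\partial q_i}\,\xi_i + \frac{\partial L}{\partial \dot{q}_i}\,\dot{\xi}_i \right) = 0.$$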
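The theorem is proved in two lines by playing the invariance [eq:symmetry-invariance] against the equation of motion [eq:euler-lagrange-eq]. On a physical path the coordinate derivative $\partial L/\partial q_i$ equals $\tfrac{d}{dt}\,\partial L/\partial \dot{q}_i$ by Euler-Lagrange, so substituting it into [eq:symmetry-invariance] and recognising the product rule,
$$0 = \sum_i \left( \frac{d}{dt}\!\left( \frac{\partial L}{\partial \dot{q}_i} \right) \xi_i + \frac{\partial L}{\partial \dot{q}_i}\,\dot{\xi}_i \right) = \frac{d}{dt} \sum_i \frac{\partial L}{\partial \dot{q}_i}\,\xi_i,$$
the whole expression is a total time derivative, and the quantity inside it is constant along the motion.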
To every continuous symmetry [eq:symmetry-invariance] of the Lagrangian there corresponds a conserved quantity, the Noether charge,
$$Q = \sum_i \frac{\partial L}{\partial \dot{q}_i}\,\xi_i,$$
constant along every physical motion. If the deformation changes the Lagrangian by a total time derivative $dK/dt$ rather than leaving it strictly invariant, the conserved charge is $Q - K$.
The angular momentum of Section 3.3 is the charge of rotational symmetry, and geometric algebra states it as the whole bivector rather than a component. Let the Lagrangian $L = \tfrac12 m\,\dot x^2 - V(|x|)$ of a particle in a central potential be turned by the rotor generated by a fixed unit bivector $B$. The infinitesimal rotation is the contraction $\delta x = \epsilon\, B\cdot x$ of Section 3.5, and it preserves both the speed $|\dot x|$ and the distance $|x|$ from the centre, so [eq:symmetry-invariance] holds. The Noether charge [eq:noether-charge] is the momentum $p = m\dot x$ paired with the generator,
\[
Q = p\cdot(B\cdot x) = B\cdot(x\wedge p),
\]
the scalar product of the generating plane with the angular momentum bivector $x\wedge p$ of [eq:angular-momentum]. Since this is conserved for every choice of the plane $B$, the whole bivector $x\wedge p$ is conserved, which is the constancy of the angular momentum plane that Section 3.4 derived from the vanishing torque, now read as the consequence of rotational symmetry. Where the matrix account conserves the three components of an axial vector, the symmetry delivers the oriented plane directly.
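The conservation of the whole bivector can be checked numerically. The sketch below is illustrative only: it assumes an attractive potential $V(r) = -k/r$, unit mass, and a velocity-Verlet integrator, none of which are fixed by the text, and the function names are hypothetical. It tracks the three components of $x\wedge p$ in an orthonormal basis along an orbit.

```python
import math

def wedge(x, p):
    """Components (B12, B13, B23) of the bivector x ^ p in an orthonormal basis."""
    return (x[0] * p[1] - x[1] * p[0],
            x[0] * p[2] - x[2] * p[0],
            x[1] * p[2] - x[2] * p[1])

def force(x, k=1.0):
    """Central force from the assumed potential V(r) = -k / r."""
    r = math.sqrt(sum(c * c for c in x))
    return [-k * c / r**3 for c in x]

def orbit(x, v, m=1.0, dt=1e-3, steps=20000):
    """Velocity-Verlet integration; returns the bivector x ^ p after every step."""
    history = [wedge(x, [m * c for c in v])]
    f = force(x)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]   # half kick
        x = [xi + dt * vi for xi, vi in zip(x, v)]             # drift
        f = force(x)
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]   # half kick
        history.append(wedge(x, [m * c for c in v]))
    return history

history = orbit(x=[1.0, 0.0, 0.0], v=[0.0, 0.8, 0.3])

# Largest departure of any bivector component from its initial value.
drift = max(abs(a - b) for B in history for a, b in zip(B, history[0]))
print("max drift of x ^ p:", drift)
```

Velocity Verlet is chosen here because each kick is parallel to $x$ and each drift is parallel to $v$, so neither changes $x\wedge v$; the drift printed is then rounding error rather than integration error. All three components stay fixed together, which is the statement that the plane itself, and not one projection of it, is conserved.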
The other cardinal example is energy, the charge of symmetry under translation in time. When the Lagrangian carries no explicit $t$, shifting the whole motion earlier or later leaves the action unchanged, and the corresponding conserved quantity is built from the velocities and the Lagrangian.
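When the Lagrangian has no explicit time dependence, $\partial L/\partial t = 0$, the energy
\begin{equation}\label{eq:energy}
E = \sum_i \frac{\partial L}{\partial \dot q_i}\,\dot q_i - L
\end{equation}
is conserved along every motion. For $L = T - V$ with $T$ quadratic in the velocities, $E = T + V$, the total mechanical energy.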
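That [eq:energy] is constant follows from the same play of Euler-Lagrange against the chain rule: differentiating $L$ along the motion gives $\dot L = \sum_i \left( \frac{\partial L}{\partial q_i}\,\dot q_i + \frac{\partial L}{\partial \dot q_i}\,\ddot q_i \right)$, and replacing the first factor $\partial L/\partial q_i$ by $\frac{d}{dt}\left(\partial L/\partial \dot q_i\right)$ collects the right side into $\frac{d}{dt}\sum_i \frac{\partial L}{\partial \dot q_i}\,\dot q_i$, so $\frac{d}{dt}\left( \sum_i \frac{\partial L}{\partial \dot q_i}\,\dot q_i - L \right) = 0$, which is the constancy of [eq:energy]. The combination $\sum_i \frac{\partial L}{\partial \dot q_i}\,\dot q_i - L$ is the very object the next section makes into the Hamiltonian, and its conservation here is the first sign that it is the natural energy of the system.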
4.5. The Legendre transform and the Hamiltonian
The Euler-Lagrange equations of Section 4.2 are second-order equations in the coordinates and velocities. There is a second formulation, equivalent in content and often cleaner in structure, that trades each velocity for a momentum and replaces the second-order equations by first-order ones on a doubled space. The passage between the two is a Legendre transform, and its output is the Hamiltonian. This section makes the trade and states Hamilton's equations, then works the pendulum in the new variables.
The quantity conjugate to a coordinate is the velocity derivative of the Lagrangian that already appeared inside the Euler-Lagrange equation and the Noether charge.
The conjugate momentum to the coordinate $q_i$ is
\begin{equation}\label{eq:conjugate-momentum}
p_i = \frac{\partial L}{\partial \dot q_i},
\end{equation}
the same object whose time derivative is the left side of the Euler-Lagrange equation [eq:euler-lagrange-eq]. For a free particle $p_i$ is the ordinary momentum of [eq:momentum]; in general it is a different combination, carrying the index of the coordinate it is conjugate to.
The Legendre transform exchanges the velocity for the momentum as the independent variable, passing from the Lagrangian on coordinates and velocities to a new function on coordinates and momenta. The transform is well defined when the momenta [eq:conjugate-momentum] can be solved for the velocities, which holds when the Lagrangian is strictly convex in the velocities, as a kinetic energy quadratic with positive mass always is.
The Hamiltonian is the Legendre transform of the Lagrangian in the velocities,
\begin{equation}\label{eq:hamiltonian-def}
H(q, p) = \sum_i p_i\,\dot q_i - L(q, \dot q),
\end{equation}
with every velocity on the right expressed through [eq:conjugate-momentum] as a function of the coordinates and momenta. The variables $(q_i, p_i)$ are the coordinates of phase space, and a state of the system is a single point there.
The Hamiltonian [eq:hamiltonian-def] is exactly the conserved energy [eq:energy] of Section 4.4, now regarded as a function on phase space rather than a quantity along a path. The equations of motion in the new variables follow by differentiating [eq:hamiltonian-def] and using the definition of the momentum to cancel the velocity terms, a short computation whose result is a symmetric pair.
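The Euler-Lagrange equations [eq:euler-lagrange-eq] are equivalent to the first-order Hamilton equations
\begin{equation}\label{eq:hamilton-eqs-pair}
\dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i},
\end{equation}
one pair per degree of freedom, governing the motion of the phase-space point.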
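To see the pair, take the differential of [eq:hamiltonian-def]. The velocity terms cancel by the definition [eq:conjugate-momentum] of the momentum, since $dH = \sum_i \left( \dot q_i\,dp_i + p_i\,d\dot q_i - \frac{\partial L}{\partial q_i}\,dq_i - \frac{\partial L}{\partial \dot q_i}\,d\dot q_i \right)$ loses its $d\dot q_i$ terms outright, leaving $dH = \sum_i \left( \dot q_i\,dp_i - \frac{\partial L}{\partial q_i}\,dq_i \right)$. Reading off the coefficients gives $\dot q_i = \partial H/\partial p_i$ at once, and $\partial H/\partial q_i = -\partial L/\partial q_i = -\dot p_i$ using the Euler-Lagrange equation, which is the second of [eq:hamilton-eqs-pair]. The single second-order law has become two first-order laws, symmetric under the near-exchange of $q$ and $p$ that the next section makes exact.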
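Return to the pendulum Lagrangian [eq:pendulum-lagrangian], $L = \tfrac12 m\ell^2\dot\theta^2 + mg\ell\cos\theta$. Its conjugate momentum [eq:conjugate-momentum] is $p_\theta = m\ell^2\dot\theta$, which inverts to $\dot\theta = p_\theta/(m\ell^2)$. The Hamiltonian [eq:hamiltonian-def] is
\begin{equation}\label{eq:pendulum-hamiltonian}
H(\theta, p_\theta) = \frac{p_\theta^2}{2m\ell^2} - mg\ell\cos\theta,
\end{equation}
the kinetic term now a function of the momentum and the potential unchanged. Hamilton's equations [eq:hamilton-eqs-pair] read
\begin{equation}\label{eq:pendulum-hamilton-eqs}
\dot\theta = \frac{p_\theta}{m\ell^2}, \qquad \dot p_\theta = -mg\ell\sin\theta.
\end{equation}
To check that these reproduce the Lagrangian result, differentiate the first and substitute the second: $\ddot\theta = \dot p_\theta/(m\ell^2) = -(g/\ell)\sin\theta$, exactly the pendulum equation [eq:pendulum-eom]. The two first-order equations of [eq:pendulum-hamilton-eqs] carry the same swing as the one second-order equation, and they do it as a flow of the phase-space point $(\theta, p_\theta)$, whose portrait the next section animates.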
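The passage from a Hamiltonian to a flow can be sketched in a few lines. The example below is a rough illustration under stated assumptions: the pendulum Hamiltonian is taken in the standard form $p^2/(2m\ell^2) - mg\ell\cos\theta$, the parameter values are arbitrary, the partial derivatives of $H$ are taken by central finite differences, and a classical fourth-order Runge-Kutta step is used as the integrator. The function names are hypothetical.

```python
import math

m, ell, g = 1.0, 1.0, 9.81   # illustrative parameter values (assumed)

def H(theta, p):
    """Pendulum Hamiltonian: kinetic term in the momentum plus the potential."""
    return p * p / (2 * m * ell**2) - m * g * ell * math.cos(theta)

def hamilton_rhs(theta, p, h=1e-6):
    """Hamilton's equations read off H by central finite differences:
    theta_dot = dH/dp, p_dot = -dH/dtheta."""
    dH_dp = (H(theta, p + h) - H(theta, p - h)) / (2 * h)
    dH_dtheta = (H(theta + h, p) - H(theta - h, p)) / (2 * h)
    return dH_dp, -dH_dtheta

def rk4_step(theta, p, dt):
    """One classical Runge-Kutta step of the phase-space point (theta, p)."""
    k1 = hamilton_rhs(theta, p)
    k2 = hamilton_rhs(theta + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
    k3 = hamilton_rhs(theta + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
    k4 = hamilton_rhs(theta + dt * k3[0], p + dt * k3[1])
    theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    p += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return theta, p

theta, p = 1.0, 0.0          # released from rest at one radian
E0 = H(theta, p)
for _ in range(5000):        # five seconds of motion at dt = 1e-3
    theta, p = rk4_step(theta, p, 1e-3)

energy_drift = abs(H(theta, p) - E0)
print("energy drift:", energy_drift)
```

The integrator never sees the second-order pendulum equation; it only differentiates $H$. That the energy stays put along the computed path is a numerical echo of the statement that the Hamiltonian is the conserved energy of Section 4.4.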
4.6. The symplectic bivector and Poisson brackets
Hamilton's equations of Section 4.5 are already nearly symmetric between coordinate and momentum, and the last step exposes the structure that makes the symmetry exact. Phase space carries a distinguished bivector, and against it every observable acquires a rate of change written as a single bracket. This section introduces that bivector, defines the Poisson bracket it induces, casts Hamilton's equations in bracket form, and reads off the conservation of phase-space volume that the bracket structure guarantees.[5]
The pairing of each coordinate with its conjugate momentum is an oriented plane in phase space, and summing those planes over the degrees of freedom gives one bivector, in the exact sense of the grade-two objects of Part I, now built on the $2n$-dimensional phase space rather than on physical space.
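On the phase space with coordinates $(q_i, p_i)$, the symplectic bivector is the sum of the coordinate-momentum planes,
\begin{equation}\label{eq:symplectic-bivector-def}
J = \sum_{i=1}^{n} e_i \wedge f_i ,
\end{equation}
with $e_i$ the unit vector along the coordinate $q_i$ and $f_i$ the unit vector along its momentum $p_i$, a fixed grade-two element pairing each coordinate direction with its conjugate momentum direction. It is nondegenerate, meaning that no nonzero phase-space vector contracts with it to zero, and it equips phase space with the oriented-area measure that the dynamics preserves.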
The symplectic bivector turns the gradient of any phase-space function into a flow, and applying that to two functions in turn gives their Poisson bracket, the antisymmetric pairing at the heart of the Hamiltonian formalism.
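The Poisson bracket of two functions $f$ and $g$ on phase space is
\begin{equation}\label{eq:poisson-bracket-def}
\{f, g\} = \sum_i \left( \frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i} \right),
\end{equation}
the contraction of their gradients through the symplectic bivector [eq:symplectic-bivector-def]. It is bilinear and antisymmetric, $\{f, g\} = -\{g, f\}$, obeys the Jacobi identity, and on the coordinates themselves takes the canonical values
\begin{equation}\label{eq:canonical-brackets}
\{q_i, q_j\} = 0, \qquad \{p_i, p_j\} = 0, \qquad \{q_i, p_j\} = \delta_{ij}.
\end{equation}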
The fundamental bracket of [eq:canonical-brackets] is the phase-space statement that a coordinate and its own momentum are conjugate, and it is the one relation from which the whole bracket algebra is built. With the bracket in hand, Hamilton's equations [eq:hamilton-eqs-pair] collapse to a single form that governs every observable at once.
Any phase-space function $f(q, p)$ with no explicit time dependence evolves along the motion by its bracket with the Hamiltonian,
\begin{equation}\label{eq:bracket-motion}
\dot f = \{f, H\},
\end{equation}
and Hamilton's equations [eq:hamilton-eqs-pair] are the special cases $f = q_i$ and $f = p_i$. A quantity is conserved exactly when its bracket with the Hamiltonian vanishes, and $\{H, H\} = 0$ by antisymmetry recovers the conservation of energy of Section 4.4.
Equation [eq:bracket-motion] is the chain rule read through the bracket: $\dot f = \sum_i \left( \frac{\partial f}{\partial q_i}\,\dot q_i + \frac{\partial f}{\partial p_i}\,\dot p_i \right)$, and substituting Hamilton's equations [eq:hamilton-eqs-pair] for $\dot q_i$ and $\dot p_i$ turns the right side into precisely the bracket [eq:poisson-bracket-def] of $f$ with $H$. That a conserved quantity is one commuting with the Hamiltonian under the bracket is the Hamiltonian face of Noether's theorem, and the two examples of Section 4.4 reappear as $\{H, H\} = 0$ for the energy and $\{x\wedge p, H\} = 0$ for the angular momentum plane in a central field.
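The bracket statements above lend themselves to a quick numerical check. The following sketch is illustrative only: it evaluates the coordinate form of the Poisson bracket by central finite differences at a single phase-space point, for a planar particle in an assumed potential $V(r) = -k/r$ with unit mass. The helper names and the sample point are hypothetical.

```python
import math

def grad(f, z, h=1e-5):
    """Central-difference gradient of f at the phase-space point z."""
    out = []
    for k in range(len(z)):
        up = list(z); up[k] += h
        dn = list(z); dn[k] -= h
        out.append((f(up) - f(dn)) / (2 * h))
    return out

def poisson(f, g, z):
    """Poisson bracket {f, g} at z = [q_1..q_n, p_1..p_n]."""
    n = len(z) // 2
    df, dg = grad(f, z), grad(g, z)
    return sum(df[i] * dg[n + i] - df[n + i] * dg[i] for i in range(n))

# Planar particle: z = [x, y, px, py], unit mass, assumed potential -1/r.
def H(z):
    x, y, px, py = z
    return 0.5 * (px * px + py * py) - 1.0 / math.hypot(x, y)

def Lxy(z):
    """The single component of x ^ p in the plane of motion."""
    x, y, px, py = z
    return x * py - y * px

x_coord = lambda z: z[0]
px_coord = lambda z: z[2]
py_coord = lambda z: z[3]

z0 = [0.7, -0.4, 0.3, 0.9]   # arbitrary sample point

print("{x, px}  =", poisson(x_coord, px_coord, z0))   # canonical pair
print("{x, py}  =", poisson(x_coord, py_coord, z0))   # different degrees of freedom
print("{x, H}   =", poisson(x_coord, H, z0))          # should match px / m
print("{H, H}   =", poisson(H, H, z0))                # antisymmetry
print("{Lxy, H} =", poisson(Lxy, H, z0))              # rotational symmetry
```

The last two brackets vanish for different reasons: $\{H, H\}$ by antisymmetry alone, and the angular momentum bracket only because the assumed potential depends on the distance from the centre and nothing else. Replacing the potential with one that depends on $x$ alone makes the final bracket nonzero while leaving $\{H, H\}$ at zero.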
The flow generated by the Hamiltonian preserves the symplectic bivector [eq:symplectic-bivector-def], and with it the oriented-volume measure that the bivector defines. This is Liouville's theorem, and it is the deepest structural fact of the Hamiltonian picture.
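The phase-space flow of Hamilton's equations [eq:hamilton-eqs-pair] preserves the symplectic bivector [eq:symplectic-bivector-def], and therefore the phase-space volume it induces: a region of initial conditions carried along by the flow keeps its $2n$-dimensional volume for all time, though its shape may distort without limit. Equivalently the phase-space velocity field is divergence-free,
\begin{equation}\label{eq:liouville-div}
\sum_i \left( \frac{\partial \dot q_i}{\partial q_i} + \frac{\partial \dot p_i}{\partial p_i} \right) = \sum_i \left( \frac{\partial^2 H}{\partial q_i\,\partial p_i} - \frac{\partial^2 H}{\partial p_i\,\partial q_i} \right) = 0,
\end{equation}
the two second derivatives cancelling because mixed partials commute.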
The computation [eq:liouville-div] is the whole proof: the flow of Hamilton's equations has no sources or sinks, so it moves phase-space volume as an incompressible fluid moves. For a single degree of freedom the volume is an oriented area, and the invariance of the symplectic bivector is the statement that any patch of the phase plane keeps its area as it is swept along, however the Hamiltonian shears it. The pendulum makes it visible: a ring of initial conditions in the $(\theta, p_\theta)$ plane of [eq:pendulum-hamilton-eqs] circulates and stretches under the flow, yet the area it encloses does not change.
The closed contours of the figure are the level sets of the Hamiltonian [eq:pendulum-hamiltonian], each a motion of fixed energy, the small ones near the bottom the gentle swings and the large ones the rotations over the top, with the dividing separatrix the swing that just reaches the upright. The advected patch keeps its oriented area exactly, the one-degree-of-freedom case of Liouville's theorem [liouville], and the symplectic bivector [eq:symplectic-bivector-def] is the invariant the whole flow respects. This is where the two languages meet again at the end: the oriented plane that opened the article as the outer product of two vectors returns as the symplectic bivector on phase space, the object mechanics conserves most deeply, and it is the same grade-two element throughout.
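The area preservation described above can be reproduced in a small numerical experiment. The sketch below is a minimal illustration under stated assumptions: units are chosen so that $m\ell^2 = 1$, the ratio $g/\ell$ is set to an arbitrary value, the ring of initial conditions is a circle of hypothetical centre and radius, and the flow is approximated by the leapfrog (Störmer-Verlet) scheme, a symplectic integrator swapped in for the exact flow. The enclosed area is measured with the shoelace formula on the polygon of advected points.

```python
import math

g_over_l = 9.81   # assumed g / ell; units with m * ell**2 = 1, so theta_dot = p

def leapfrog(theta, p, dt, steps):
    """Advance one phase-space point with the leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * g_over_l * math.sin(theta)   # half kick
        theta += dt * p                              # drift
        p -= 0.5 * dt * g_over_l * math.sin(theta)   # half kick
    return theta, p

def shoelace(points):
    """Signed area of the polygon through the given (theta, p) points."""
    n = len(points)
    return 0.5 * sum(points[i][0] * points[(i + 1) % n][1]
                     - points[(i + 1) % n][0] * points[i][1]
                     for i in range(n))

# A ring of initial conditions around (theta, p) = (1, 0), listed counterclockwise.
n = 400
ring = [(1.0 + 0.3 * math.cos(2 * math.pi * k / n),
         0.3 * math.sin(2 * math.pi * k / n)) for k in range(n)]

# Carry every point of the ring along the flow for two time units.
moved = [leapfrog(th, p, 2e-3, 1000) for th, p in ring]

area_before = shoelace(ring)
area_after = shoelace(moved)
print("area before:", area_before)
print("area after: ", area_after)
```

The two areas agree up to the error of approximating a smooth curve by a polygon, even though the advected ring is no longer a circle. A non-symplectic scheme such as forward Euler would show the area growing step by step, which is one practical reason symplectic integrators are preferred for Hamiltonian flows.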
4.7. Coda: what geometric algebra bought us
The construction is complete. From an inertial frame and the single identification of a point with its position vector, the article built the algebra of a single instant, the calculus that differentiates it, the Newtonian and rigid-body dynamics it governs, and the variational formalism that reorganises the whole. It did so in two languages held in step, and the promise made in the Introduction was that the second language would earn its place. Three anchors carried that promise, and each is now a debt discharged.
The first anchor was the catalogue of discomforts of the matrix language in Section 1.5. A rotation of the plane had no real eigenvalue, the axis of a spatial rotation was an accident of odd dimension, the cross product was a three-dimensional pseudovector that reflected with the wrong sign, and the determinant and orientation were bolted on from outside the algebra of vectors. The geometric product retired every one of them. The imaginary unit that the plane rotation demanded became the unit bivector squaring to $-1$ of Section 1.8, already inside the real algebra of space; the axis gave way to the plane of rotation, defined in every dimension; the cross product was unmasked in Section 1.9 as the dishonest dual of the honest bivector $a\wedge b$; and the determinant became, in Section 1.11, the scale by which an operator inflates the pseudoscalar, a product within the algebra rather than a sum imported from without.
The second anchor was the calculus, tallied in Section 2.5. The gradient, divergence, and curl, three operators of incompatible type with the curl confined to three dimensions, became the grade parts of one vector derivative $\nabla$; and the four integral theorems of vector analysis, one welded to each dimension, became grade projections of the single fundamental theorem of geometric calculus. One operator and one theorem, defined in every dimension and needing no handedness convention, stood where the classical apparatus spread a zoo.
The third anchor was the dynamics, the rotor kinematics of Section 3.6. Orientation carried by Euler angles gimbal-locks, its equations dividing by a sine that vanishes when two axes align, and rotation matrices carry nine numbers under six constraints. The rotor law has no coordinates on the orientation and so no coordinate singularity anywhere; it preserves its own normalisation exactly, composes by multiplication, and integrates a genuine rotation for all time. The same rotor returned in Section 4.3 as the configuration of a variational principle, and the Euler equations fell out of the geometry of the variation with the commutator product standing where the cross product once stood.
What the second language bought, in the end, is a single change of vantage kept up for the length of the article: the oriented plane in place of the axis, the whole element of the algebra in place of its shadow in coordinates. The bivector that first appeared as the outer product of two vectors became the angular momentum of an orbit, the angular velocity of a spinning body, the generator of a rotor, and at the last the symplectic form on phase space that mechanics conserves most deeply. It was one object throughout, and following it rather than its components is the whole of what geometric algebra offered. The matrix language kept every formula computable, as it was built to; the geometric language kept every formula meaning what it said. Classical mechanics, constructed from an inertial frame and nothing else, is legible in both, and most legible where the two are read together.
References
- D. Hestenes, G. Sobczyk (1984). Clifford Algebra to Geometric Calculus. Reidel.
- C. Doran, A. Lasenby (2003). Geometric Algebra for Physicists. Cambridge University Press.
- D. Hestenes (1999). New Foundations for Classical Mechanics, 2nd ed. Kluwer Academic Publishers.
- V. I. Arnold (1989). Mathematical Methods of Classical Mechanics, 2nd ed. Springer-Verlag.
- J. E. Marsden, T. S. Ratiu (1999). Introduction to Mechanics and Symmetry, 2nd ed. Springer.