Sussman is one of the very few classical mechanics textbooks that gives a reasonable definition of the Legendre transform. Most physicists cannot actually tell you what that transform is, even though it sits at the heart of both classical and quantum mechanics.
The description in this article is great, but the why is still rather mysterious. How would somebody come up with that?
Agreed, thanks. That blog post was just trying to explain what it is, not why it is, mostly as a basis to complain about education rather than teach physics.
Christ, that's _awesome_. I wish I was taught the Legendre transform this way. Where did you run into this viewpoint? I'd love to read a textbook that explains physics this way.
In fact, I to this day have not found a book that caters to my level of mathematical knowledge (undergrad pure math) on variational calculus. They're either _way_ too handwavy (like Taylor), or _way_ beyond reach (relying on heavy functional analysis, weak convergence, all that stuff).
Is there a textbook/lecture series/what have you that explains the calculus of variations at a "non-handwavy" level, BUT does not try to perform the calculations in the most general way possible? That is, pick a nice space (C^\infinity or some such) and then show me with full rigor how the calculus of variations can be constructed. I'd love some references :)
As someone who struggled with physicists' explanations in college, I argue that much of the lack of clarity is a deliberate ambiguity about the meaning of the underlying mathematical structure of the symbols you're told to manipulate.
As an example, when I was taught calculus of variations, we didn't dwell too much on the fact the symbols in our functions we were differentiating were no longer corresponding to points in R^n, but rather to functions existing in function space. Why the mathematics we were taught in Vector Calculus worked here as well was quietly ignored, to the extent that when I raised the issue with classmates, they didn't realize that anything had changed from vector calculus.
Another example is physicists playing fast and loose with what exactly they mean by 'vector' (hint: they mean it's really a function on R^3, but good luck finding an undergrad textbook that says that; I'm curious as to how Sussman treats this).
Sussman, by virtue of implementing programs, takes a constructive approach, and thus forces both the author and reader to come to an explicit understanding of what the underlying mathematical objects actually are and the structure of how solutions are formed. Thus, I am not surprised that he comes up with a superior and clearer definition of the Legendre transform.
Having taught classical mechanics, I have to say: yes, we do that (intentionally), but it is due to a lack of time and a lack of preparedness among the students. I have 16 weeks of instruction, with 2 lectures of 90 minutes each, to cover all the parts of classical mechanics that the students will need later. (Very little research is done in classical mechanics these days.) A third of the class is deathly afraid of calculus. I simply cannot spare 6 lectures to explain the calculus of variations when I only need it for two derivations. After all, it still is a vector space and the math they know still works. I am happy to discuss the details (why does it work, what is a function space, what is the dual space of distributions) with the five students who care, after class or during office hours.
I don't know how many times I read over the definition of the Legendre transform and never really got it. I'll definitely check out that section. I feel like as an outsider Sussman brings a much needed fresh viewpoint in explaining things in physics. His explanation of the derivatives in the Euler-Lagrange equation was similarly great.
I don't understand this mysticism. The Legendre transformation is just one example of the canonical transformations that you can use to switch from one set of variables to another, typically out of convenience (you can rest assured physicists are aware of this, and I would be more conservative before saying it lies "at the heart" of anything --it just links Hamiltonian mechanics to Lagrangian mechanics, but they're basically two different ways we use to describe the same physical thing, each parametrization with its own intuitive features--, and it doesn't really show up in quantum mechanics).
Yes, you can go beyond the actual purpose of "I just want to get a function of x,y starting from a function of x,z" and do illustrative geometric interpretations for every single canonical transformation out there, but physics books tend to be practical and not to dwell on such mathematical "curiosities".
Still, the Legendre transform is arguably the most important one, and some books do go further. Off the top of my head, you can see Analytical Mechanics by Hand & Finch, which precedes that blog post by 2 decades. (Interestingly, he claims to have read this book --not carefully, apparently, because the equation he finds surprising is there; it's just written in words relating two equations, rather than as a new symbolic equation).
You mistakenly say the Legendre Transform doesn't really show up in quantum mechanics, which is likely because you didn't notice when it appeared in the construction of the path integral. (Or maybe you just haven't used quantum mechanics outside of the non-relativistic domain, in which case you've probably only used the Hamiltonian formulation and don't know why? I don't know your background.) That's an easy mistake to make if you think about it as just some coordinate transform, since then you won't notice it when it's used in a different guise than when it was taught to you. Many field theorists would say the path integral is the most beautiful idea in theoretical physics -- Noether's theorem probably being the only idea that is more popular -- and if you don't understand how it bakes in the Legendre transform you don't understand it well.
Other than that technical mistake, which I think is revealing, the rest of your comment is hard to respond to because it takes the form "this thing you claim is important for given reasons just isn't; it's not a big deal". So all I can do is try to flesh out my reasons, and hope you'll engage with them.
The Lagrangian and Hamiltonian formulations are the two primary ways to write down fundamental physical laws. There is no third formulation of remotely comparable importance or clarity, so the Legendre transform has a special role not enjoyed by other coordinate transforms. If you want to actually understand why laws are formulated how they are, and especially if you want to be prepared to move beyond them in case new physics requires it, you need to have a deep understanding of how the Lagrangian and Hamiltonian formulations are linked. Otherwise it's impossible to answer questions like: "If the Lagrangian formulation puts space and time on equal footing, and if the Hamiltonian formulation gives a preferred role to time (generating time evolution), could we give a similarly preferred role to space?" "More generally, why isn't there a third or fourth major formulation of mechanics?" "Where does the incredible richness of symplectic geometry come from, and why isn't there anything similar associated with the Lagrangian formulation?" "How would any of this change if there was more than one time dimension?"
Regardless, Hand & Finch is awful. I encourage you to quote the piece of Hand & Finch on the Legendre Transform that you think is clear, as I have quoted the books I think are confused. And then we can see whether students find my explanation or their explanation clearer.
If you say "well, the ideas are encoded in Hand & Finch even if the students don't understand them", you've missed the point. I make zero claims of novelty, and neither, I'm sure, would Hand & Finch. (These ideas are nearly two centuries old, so pointing out that their textbook is two decades old is silly.) My complaint, rather, is that these old ideas crucial to the invention of mechanics are not being faithfully transmitted to generations of students who take mechanics for granted.
I don’t think the typical “change of variables” definition is bad.
You take the derivative of L along the fiber of the tangent bundle.
If the derivative is non-singular it defines an isomorphism in each
point of the tangent space with the cotangent space. And that’s the
important thing, going from the tangent bundle to the cotangent
bundle. Now we can use all the beauty of symplectic geometry
The change of variable definition, as actually presented in the textbooks everyone teaches from, is horrible. That's the topic of the blog post. Yes, that definition can be made clear after introducing a bunch of machinery of symplectic geometry, but I'm doubtful this is good pedagogy and I'm confident that, due to time constraints, it could never be taught to most physicists.
Great summary. I also love Prof. V. Balakrishnan's exposition on this topic. The video is long, but he motivates the ideas very clearly and in context: https://www.youtube.com/watch?v=GOkZs2RZMQY.
I actually took this class as a sophomore in CS and it was part of the reason I switched to physics. It's not an easy text - there's a lot you have to keep in your head all at once - but the approach to mechanics is the most elegant I've seen (personal opinion though). It was the first class I took where I really had to read through the book and my notes after class, but the understanding you gain about classical mechanics, math, and functional programming is second to none. Really formative experience I'd say - well worth the effort!
As mentioned in a footnote in the blog post, I think Gelfand and Fomin’s “Calculus of Variations” is very clear on its topic. Wald's "General Relativity" is slow-going, but excellent. No good textbooks on quantum mechanics or QFT exist; Weinberg's "Quantum Theory of Fields" is probably the least terrible on that. Nielsen and Chuang's "Quantum Computation and Quantum Information" is still excellent, but pretty out-of-date.
Also, to be clear, I haven't worked through most of Sussman, so I can't recommend it one way or the other. I was just commenting on the handling of the Legendre transform.
I love this book! Prior to reading it, I had been getting confused when trying to learn classical mechanics. The book writes out everything explicitly in code, which let me use the software engineering part of my brain, and made everything easy to follow. Apparently I had been held back by unfamiliarity with math formalisms!
If you're interested in the topic and know how to program, it might be worth a read in case it turns out you're in the same boat.
There is a Racket port; I do not know how complete it is, though a quick look at the open issues suggests not all the pieces are there yet: https://github.com/bennn/mechanics
Looks like MIT Scheme (on which the ScmUtils system runs) still gets some regular maintenance releases, though I don't know how active development is these days.
http://blog.jessriedel.com/2017/06/28/legendre-transform/
If you are familiar with the method of Lagrange multipliers, then what's happening can be explained as follows. Given the Lagrangian L(x,v), the problem of classical mechanics is to find a trajectory x(t), v(t) that extremises the integral of L(x(t),v(t)) dt under the constraint x'(t) = v(t). Lagrange multipliers are a method for dealing with constraints in optimisation problems. The method is usually taught in the finite-dimensional case, but it also works in the infinite-dimensional case. We introduce a Lagrange multiplier p(t) and add the constraint to the objective: integral of [L(x(t),v(t)) + p(t)(x'(t) - v(t))] dt. To solve the problem we extremise this over x, v, p. If we carry out the minimisation over v first, then we're left with two variables x, p. That's the Hamiltonian formulation of the problem, and it's called the dual problem in convex optimisation. So the momentum p is the Lagrange multiplier for the constraint x' = v.
In more detail: we rewrite L(x(t),v(t)) + p(t)(x'(t) - v(t)) = L(x(t),v(t)) - p(t)v(t) + p(t)x'(t). Now we separate out H(x,p) = min_v [L(x,v) - pv], so the original problem becomes to extremise the integral of [H(x,p) + p(t)x'(t)] dt. (With this sign convention H is minus the usual Hamiltonian.) Applying the Euler-Lagrange equations for x and p, we obtain Hamilton's equations:
dH/dx = dp/dt
dH/dp = -dx/dt
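To make the "minimise over v first" step concrete, here is a small numerical sketch. It assumes a harmonic-oscillator Lagrangian with m = k = 1 (an illustrative choice, not something from the comment above) and uses a plain ternary search for the minimisation; the function names are made up for this example. It checks that H(x,p) = min_v [L(x,v) - pv] comes out as minus the familiar energy p^2/2 + x^2/2, with the minimising velocity v = p.

```python
def lagrangian(x, v, m=1.0, k=1.0):
    """L(x, v) = kinetic - potential energy for a harmonic oscillator (assumed example)."""
    return 0.5 * m * v * v - 0.5 * k * x * x


def minimise(f, lo=-100.0, hi=100.0, iters=200):
    """Ternary search for the minimiser of a convex 1-D function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            hi = b  # the minimum cannot lie to the right of b
        else:
            lo = a  # the minimum cannot lie to the left of a
    return 0.5 * (lo + hi)


def h_dual(x, p):
    """H(x, p) = min_v [L(x, v) - p v], using the sign convention of the comment."""
    v_star = minimise(lambda v: lagrangian(x, v) - p * v)
    return lagrangian(x, v_star) - p * v_star, v_star


x, p = 0.7, 1.3
h, v_star = h_dual(x, p)
energy = 0.5 * p * p + 0.5 * x * x  # usual Hamiltonian for m = k = 1

print("minimising velocity:", v_star, "(momentum p =", p, ")")
print("H from the minimisation:", h)
print("minus the usual energy: ", -energy)
```

The minimising v satisfies dL/dv = p, which is the textbook definition of the momentum; the ternary search only works here because L(x,v) - pv is convex in v.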
The notation chapter with live snippets is reproduced here: http://io.livecode.ch/learn/namin/scheme-mechanics/chapter9
Personally, my favourite theorem in classical mechanics is the so-called 'tennis racket theorem', sometimes known as the 'intermediate axis theorem'.
It explains why an object with three distinct principal moments of inertia rotates unstably about the axis with the intermediate moment.
It can be easily demonstrated with a tennis racket, or even most smartphones (be careful not to break it though).
https://en.wikipedia.org/wiki/Tennis_racket_theorem
https://www.youtube.com/watch?v=4dqCQqI-Gis
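A rough numerical illustration of the theorem, if you'd rather not throw your phone: the sketch below assumes a torque-free rigid body with principal moments of inertia (1, 2, 3) (arbitrary illustrative values) and integrates Euler's equations with a hand-rolled RK4 step. The function names are made up for this example. For each axis it starts the body spinning almost exactly about that axis and records the lowest value the spin about that axis ever reaches.

```python
def euler_rhs(w, inertia):
    """Right-hand side of Euler's equations for a torque-free rigid body."""
    i1, i2, i3 = inertia
    w1, w2, w3 = w
    return (
        (i2 - i3) * w2 * w3 / i1,
        (i3 - i1) * w3 * w1 / i2,
        (i1 - i2) * w1 * w2 / i3,
    )


def rk4_step(w, dt, inertia):
    """One classical fourth-order Runge-Kutta step for the body-frame angular velocity."""
    def shifted(k, scale):
        return tuple(wi + scale * ki for wi, ki in zip(w, k))

    k1 = euler_rhs(w, inertia)
    k2 = euler_rhs(shifted(k1, dt / 2), inertia)
    k3 = euler_rhs(shifted(k2, dt / 2), inertia)
    k4 = euler_rhs(shifted(k3, dt), inertia)
    return tuple(
        wi + dt / 6 * (a + 2 * b + 2 * c + d)
        for wi, a, b, c, d in zip(w, k1, k2, k3, k4)
    )


def min_spin_about(axis, inertia=(1.0, 2.0, 3.0), eps=1e-3, dt=0.01, steps=5000):
    """Spin almost purely about `axis` (0, 1 or 2) and return the lowest spin seen about it."""
    w = [eps, eps, eps]  # small perturbation on the other two axes
    w[axis] = 1.0
    w = tuple(w)
    lowest = w[axis]
    for _ in range(steps):
        w = rk4_step(w, dt, inertia)
        lowest = min(lowest, w[axis])
    return lowest


for axis in range(3):
    print("axis", axis + 1, "lowest spin about that axis:", min_spin_about(axis))
```

About the smallest and largest axes the spin stays close to its initial value of 1; about the intermediate axis the tiny perturbation grows until the spin changes sign, which is the flip you see with the racket.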
https://en.wikipedia.org/wiki/Gerald_Jay_Sussman
2013: https://news.ycombinator.com/item?id=6947257
2010: https://news.ycombinator.com/item?id=1581696
The most recent version of `scmutils` I see is from 2016: https://groups.csail.mit.edu/mac/users/gjs/6946/linux-instal... and https://groups.csail.mit.edu/mac/users/gjs/6946/scmutils-tar...
Edit: There seem to be quite a few ports to various languages: https://github.com/search?q=scmutils No idea how many are fully featured/tested.
https://www.gnu.org/software/mit-scheme/
https://ftp.gnu.org/gnu/mit-scheme/stable.pkg/
Installation instructions for the ScmUtils package:
http://groups.csail.mit.edu/mac/users/gjs/6946/index.html
As far as I know, Gerry and Jack still teach the course every year (can someone currently at MIT verify?) and still use the system.
I'm in his group at MIT.
I haven't done much with it recently but have plans in the back of my mind for when I have more free time.
I worked through SICM using sicmutils as a backup. The MIT Scheme version sometimes "locked up" on my solutions to exercises and sicmutils did not (the Foucault pendulum problem comes to mind).
This video is an introduction to SICM and sicmutils:
https://www.youtube.com/watch?v=7PoajCqNKpg