Feynman: "What I cannot create, I do not understand"
Einstein: "If you can't explain it to a six-year-old, you don't understand it yourself"
Of course none of this changes anything about machine-generated proofs. The point of a proof is to communicate ideas; formalization and verification simply provide a certificate showing that those ideas are worth checking out.
Formalisation and the formulation of ideas are not separate things; they are both mathematics. In particular, it is not that one should live in Lean and the other in blueprints.
Formalisation and verification are not simply certificates. For example, what language are you using for the formalisation? That influences how you can express your ideas formally. The more beautiful your language, the more the formal counterpart can look like the original informal idea. This capability, together with simplicity, might actually be a way to define what it means for a language to be beautiful.
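For instance, in Lean with Mathlib (using its lemma `Nat.exists_infinite_primes`), the formal statement of "there are infinitely many primes" can read almost like the informal sentence:

```lean
import Mathlib

-- Informal: "there are infinitely many primes."
-- Formal: for every n there is a prime p at least as large as n.
example : ∀ n : ℕ, ∃ p, n ≤ p ∧ p.Prime :=
  Nat.exists_infinite_primes
```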
There are bugs in theorem provers, which means there might be "sorries", maybe even malicious ones (depending on what is at stake), that are not that easy to detect. Personally, I don't think that is much of a problem, as you should be able to come up with a "superlean" version of your theorem prover whose correctness is easier to see, and then let the original prover export a proof that the superlean prover can check.
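As a toy sketch of that export-and-recheck pipeline (the names and the logic here are made up for illustration; a real "superlean" checker would have to handle the prover's full type theory), here is a complete, independent checker for a minimal logic of implications, small enough to audit by hand:

```lean
-- The big prover exports proof trees; this tiny checker revalidates
-- every step against the inference rules, and nothing else is trusted.

inductive Formula where
  | var (n : Nat)
  | imp (a b : Formula)
  deriving BEq

inductive Proof where
  | ax                              -- conclude any hypothesis in scope
  | intro (p : Proof)               -- to prove a → b, assume a, prove b
  | mp (a : Formula) (p q : Proof)  -- from a → b and a, conclude b

-- `check ctx goal p` returns true iff `p` proves `goal` from hypotheses `ctx`.
def check : List Formula → Formula → Proof → Bool
  | ctx, goal,     .ax       => ctx.contains goal
  | ctx, .imp a b, .intro p  => check (a :: ctx) b p
  | _,   _,        .intro _  => false  -- `intro` needs an implication goal
  | ctx, b,        .mp a p q => check ctx (.imp a b) p && check ctx a q

-- Example: an exported proof of p → p is revalidated by the checker.
#eval check [] (.imp (.var 0) (.var 0)) (.intro .ax)  -- true
```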
I think more of a concern is that mathematicians might no longer "understand" the proofs that the machine generated. This concern is not that the proof might be wrong despite being checked, but that the proof is correct yet cannot be "understood" by humans. I don't think that is too much of a concern either, as we can surely design the machine in a way that the generated proofs are modular, building up beautiful theories of their own.
A final concern might be that what gets lost is humans' understanding of what "understanding" means. I think that is the biggest concern, and I see it all the time when formalisation is discussed here on HN. Many here think that understanding is simply being able to follow the rules, and that the rules are an arbitrary game. That is simply not true. Obviously not: think about it, what does it even mean to "correctly follow the rules"?
I think the way to address this final concern (and maybe the other concerns as well) is to put beauty at the heart of our theorem provers. We need beautiful proofs, written in a beautiful language, checked and created by a beautiful machine.
LeCun is right to say that continuous self-supervised (hierarchical) learning is the next frontier, and that means we need world models. I'm not sure that JEPA is the right tool to get us past that frontier, but at the moment there are not a lot of alternatives on the table.
So the world of mathematics is really the only world model we need. If we can build a self-supervised entity for that world, we can also deal with the real world.
Now, you might argue that the "real" world is simpler and more constrained than the mathematical world, and that if we focus on what we can do in the real world, we might therefore make progress more quickly. That argument I might buy.
Note that the final result of the Flyspeck project does not depend on that proof, as the linear inequalities part was later redone and extended in HOL-Light by Alexey Solovyev, using just the LCF kernel of HOL-Light. That shows that a simple LCF kernel can definitely be fast enough for such computations, even at that scale!
I don't think it's "basically the same", because this application of the rewrite rules in an LCF-like system is explicit (i.e. the proof-checking work grows with the size of the problem), while in proof by reflection in a type theory it happens implicitly, because the "rewriting" happens as part of reduction and makes use of the definitional equality of the system?
For small and medium examples this probably doesn't matter, but I would think that for something like the four colour theorem it would.
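In Lean, for instance, the contrast looks roughly like this: a reflective proof appeals once to a decision procedure and lets kernel reduction do the real work, while an explicit proof object grows with the number of cases or rewrite steps:

```lean
-- Reflection style: `decide` runs a boolean decision procedure, and the
-- stored proof is essentially a single appeal to `of_decide_eq_true`;
-- the real work happens implicitly, as reduction inside the kernel's
-- definitional equality.
example : ∀ b : Bool, (b && !b) = false := by decide

-- Explicit style: the proof spells out every case, so the proof object
-- (and the kernel's checking work) grows with the size of the problem.
example : ∀ b : Bool, (b && !b) = false := by
  intro b
  cases b <;> rfl
```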
I've used it for proving linear inequalities as part of the Flyspeck project (the formal proof of the Kepler conjecture), and there I implemented my own rewrite engine for taking a set of rewrite rules and doing the computation outside of the LCF kernel, for example by compiling the rules to Standard ML. You can view that engine as an extension of the LCF kernel, just one more rule for how to get theorems. In that instance, it is exactly the same.
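A heavily simplified sketch of that "one more rule" view, with hypothetical names and Lean 4 standing in for the ML host language: values of type `Thm` can only be produced through the kernel's functions, so trusting an external rewrite engine just means adding one more producer:

```lean
inductive Term where
  | const (n : String)
  | app (f a : Term)
  deriving BEq, Repr

-- A theorem is an equation `lhs = rhs` certified by the kernel; the
-- private constructor means clients cannot forge theorems.
structure Thm where
  private mk ::
  lhs : Term
  rhs : Term

namespace Thm

-- Ordinary LCF-style rules: each builds new theorems from old ones.
def refl (t : Term) : Thm := ⟨t, t⟩

def trans (p q : Thm) : Option Thm :=
  if p.rhs == q.lhs then some ⟨p.lhs, q.rhs⟩ else none

-- The "one more rule": trust a compiled rewrite engine `normalize` and
-- admit `t = normalize t` directly, instead of replaying each rewrite
-- step through the rules above.
def byEngine (normalize : Term → Term) (t : Term) : Thm :=
  ⟨t, normalize t⟩

end Thm
```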
You confuse spec and proof.
I think the difference is that in a type theory you can prove the decision procedure itself sound within the system?
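Lean's `Decidable` machinery is an instance of exactly this pattern: the decision procedure is a program inside the logic, and a lemma like `of_decide_eq_true` converts a successful run of the procedure into a proof of the original proposition:

```lean
-- `decide p` computes inside the logic; its soundness is a theorem of
-- the system, not a meta-level argument about ML code.
example (p : Prop) [Decidable p] (h : decide p = true) : p :=
  of_decide_eq_true h
```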
From "Metatheory and Reflection in Theorem Proving: A Survey and Critique" by John Harrison, 1995:
> "No work on reflection has actually been done in HOL, but Slind (1992) has made some interesting proposals. His approach is distinguished from those considered previously in two important respects. First, he focuses on proving properties of programs written in Standard ML using the formal semantics to be found in Milner, Tofte, and Harper (1990). This contrasts with the other approaches we have examined, where the final jump from an abstract function inside the logic to a concrete implementation in a serious programming language which appears to correspond to it is a glaring leap of faith. [...]"
Proving that your LCF-like tactics are sound using the (informal) semantics of the tactic language (ML) seems cumbersome.
Furthermore, I believe proof by reflection crucially relies on computation happening at the logical level in order to minimise proof checking. Harrison concludes:
> "Nevertheless it is not clear that reflection’s practical utility has yet been convincingly demonstrated."
This was from 1995, so fair enough, but Paulson should be aware of Gonthier's work, which makes me wonder whether anything has changed since then?
So this is not a matter of dependent typing or static typing; the idea is simple and the same everywhere (e.g., I used it in my PhD thesis in Isabelle, back in 2008). It is just a matter of how practical it is to use in your theorem prover of choice.
That means you have to understand whether it even proves the properties you require for the software to work.
It's very easy to write a proof akin to a test that does not test anything useful...
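A tiny illustration of such a vacuous "proof" (hypothetical names): the theorem below is true and machine-checked, yet as a sorting specification it guarantees nothing, since it never demands that the output is ordered or a permutation of the input:

```lean
-- Not a sort at all, but it satisfies the "spec" below.
def badSort (xs : List Nat) : List Nat := xs

-- Machine-checked and true, yet worthless as a sorting specification.
theorem badSort_length (xs : List Nat) : (badSort xs).length = xs.length :=
  rfl
```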
I think this is sort of how Lean itself already works. It has a minimal trusted kernel that everything is forced through. Only the kernel has to be verified.