raphlinus · 2 years ago
The description of "project manager mathematicians" reminded me of "A letter to my old friend Jonathan" [1] and the followup [2] by Edsger Dijkstra from 1975. It was written as satire, basically criticizing the way that software was produced by showing how ridiculous it would be to produce mathematics the same way. But in some ways it was prescient. Of course, the main point of these documents is to critique intellectual property (particularly the absurdity of applying it to mathematical truth) but fortunately that aspect is not a significant concern of this push toward mechanization.

[1]: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/E...

[2]: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/E...

data_maan · 2 years ago
LLMs can do math. AFAIK this is the SOTA if you don't formalize it:

https://arxiv.org/abs/2301.13867

coderenegade · 2 years ago
This was fantastic. Thank you for sharing it!
pierrefermat1 · 2 years ago
Except it's a nonsensical comparison? 99.999% of software is not living on the cutting edge; it's just engineering implementation that can be managed.

tmsh · 2 years ago
Really insightful. What's missing, though, I think, is that LLMs abstract things in ways that are increasingly superhuman. Tao says: "It’s very easy to transform a problem into one that’s harder than into one that’s simpler. And AI has not demonstrated any ability to be any better than humans in this regard." But I think, due to the way LLMs work, there will be much higher-level insights that are possible. And that is the really exciting part.

Right now it's a helper. A fact checker. And doer of tedious things. Soon it will be a suggester of insights. Already LLMs compress embeddings and knowledge and have insights that we don't see.

Hinton gives the example of how a nuclear bomb and a compost heap are related and that most humans would miss that: https://www.youtube.com/watch?v=Gg-w_n9NJIE&t=4613s

Salgat · 2 years ago
The biggest problem with LLMs is that they are simply regressions over the writings of millions of humans on the internet. It's very much a machine trained to mimic how humans write, nothing more.

The bigger concern is that we don't have the training data required for it to mimic something smarter than a human (assuming that a human is given sufficient time to work on the same problem).

I have no doubt ML will exceed human intelligence, but the current bottleneck is figuring out how to get a model to train itself against its own output without human intervention, instead of doing a regression against the entire world's collective writings. Remember, Ramanujan was, with no formal training and only access to a few math books, able to achieve brilliant discoveries in mathematics. In terms of training data used by ML models, that is basically nothing.

mofeien · 2 years ago
LLMs are increasingly trained on multimodal data besides human-written text, such as videos showing complex real-world phenomena that humans may also observe but not understand, or time series data that humans may observe but be incapable of predicting. I don't see a fundamental reason why an LLM trained on such data may not already become a superhuman predictor of, e.g., time-series data, given the right model architecture, amount of data, and compute.

In a way, next-token prediction is also a time-series prediction problem, and one that is arguably more difficult than producing the time series/text in the first place: to predict what a human will say about a topic, you need to model not just the topic but also the mental processes of the human talking about it, adding another layer of abstraction.

tmsh · 2 years ago
Right, though I think the insight that LLMs "create" comes via compressing knowledge in new ways.

It's like teaching one undergraduate (reading some blog on the internet and training on next-word prediction), then another undergraduate (training on another blog), then another one, etc., and finding a way to store what's in common among all these undergraduates. Over time, it starts to see commonalities and to become more powerful than a human, or even a human who has taught a million undergraduates. But it's not just at the undergraduate level; it's the most abstract/efficient way to represent the essence of their ideas.

And in math that's the whole ball game.

zozbot234 · 2 years ago
> the current bottleneck is figuring out how to get a model to train itself against its own output without human intervention

Isn't that what AlphaZero was known for? After all, the 'zero' in the name refers to learning from zero human data, purely through self-play, which is the concept you're talking about. So we have a proof of existence already, and there have been recent papers showing progress in some mathematical domains (though simplified ones so far, such as Euclidean geometry, which does however feature complex proofs).

mistrial9 · 2 years ago
> but the current bottleneck is figuring out how to get a model to train itself against its own output without human intervention

This is sort of humorous, because the entire current generation of Transformer-based encoders was built specifically to get "a model to train itself without human intervention".

ref: "A Cookbook of Self-Supervised Learning", arXiv:2304.12210v2 [cs.LG], 28 Jun 2023

qnleigh · 2 years ago
Yes, thank you!! I have been trying to make this point for years. It seems self-evident to me, but it's hard to convince people of this.

But I think it's worth mentioning that what we have is already useful for research and making people smarter. Even Terence Tao has blogged that there were times where an LLM suggested an approach to a problem that hadn't occurred to him. This also happened to me once, enabling me to solve a problem that I'd failed to figure out on my own.

hnthrowaway0328 · 2 years ago
Does that mean ChatGPT is able to prove something, or summarize something, simply because some training materials prove or summarize similar things? What if the thing I asked it to summarize never existed in the training material? Is it capable of finding similar materials and applying the same summarization?
rdevsrex · 2 years ago
Ramanujan attributed his success in no small part to divine assistance. I wonder if sentient AI would do the same :)
calf · 2 years ago
Hinton's example doesn't resonate for me -- I do gardening as a hobby, so my immediate answer to his "What is common to nukes and a compost heap?" was the concept of critical mass. It took me 10 seconds to articulate the word, so I'm not as fast, but I still got the "correct" analogy right away, because with the right information the answer is obvious.

Hinton says this demonstrates analogical thinking. I read about gardening online and I've probably read about compost heap physics written down somewhere online.

So what more likely happened is that ChatGPT was previously trained on some article about compost-heap chain reactions. There's a lot of soil science and agriculture writing that discusses this, even for laypeople like myself, which I've already read, so it isn't a long reach if you know enough about both topics.

Thus Hinton's example strikes me as having failed to control for the LLM having already, in effect, seen the answer in its training texts.

EDIT: A few minutes later in the video, Ilya comments (https://youtu.be/Gg-w_n9NJIE?t=4966): "We have a proof of existence, the human brain is a neural network" (and the panelist agrees, barring any computational "magic" happening in the human brain). I'm interested in this comment because I generally hold that position as well, but I see many naysayers and dissenters who will respond "Well, but real neurons don't run on 8 bits, neurons have different types of cells and contain DNA in a chemical soup of hormones, etc., you are just confusing the model for the real thing in a backwards way", and so on. So I think there's an interesting divide in philosophy there as well, and I don't personally have an articulate answer for why I hold the position that natural brains are also neural networks, other than an appeal to computationalism.

29athrowaway · 2 years ago
If one AI based system becomes an expert in writing math proofs, the cost of getting another one is simply copying files and provisioning them into different cloud instances. This can be done within minutes to hours.

If one human becomes an expert in writing math proofs, the cost of getting another one is waiting for someone in the human population who likes math as a career, then waiting for that human to finish school (decades) and gain advanced specialization, with no guarantee that they will finish school or become proficient enough to advance the frontier of knowledge. By the time you finished waiting for this, you could have trillions of AI experts working in parallel.

The bandwidth of the brain when it comes to assimilating new information is low (up to hundreds of words per minute). Machines will be able to clone a lifetime of knowledge in seconds, have thousands of conversations in parallel, and even serialize parts of their "brain" and send them over to other AIs, like the Borg. These are things that are not possible within the human condition.

And once programmable matter is attained, you could produce exponential amounts of computronium and do the equivalent of millennia of research in seconds. This is how the omega point could happen.

raincole · 2 years ago
> Right now it's a helper. A fact checker. And doer of tedious things.

One of these is not like the others.

_5tsv · 2 years ago
I know absolutely nothing about math, but... I'm thinking about the history of software, where back in the day you had all these amazing projects like RollerCoaster Tycoon basically being written by one guy. Then software engineering became modularized in a way that sounds kind of similar to what is described in the interview, and now you have hordes of people who churn out React for a living (like myself) and software engineering is now a giant assembly line and the productivity (or skill required) per individual has dropped to near zero.

I guess I have the impression that when fields are in their heyday, the best work is done by some genius fitting like a hundred things in one brain, and when that gets replaced by an assembly line, the field has basically stopped producing anything of real value. Anyone in math (or otherwise) want to tell me why that's completely wrong?

cayley_graph · 2 years ago
This intuition denies the inherent human power: that we are not merely individuals, and can work together to accomplish greater things than any individual could. Software engineering has certainly _not_ stopped doing cool things -- exactly the opposite, obviously. I don't care to justify this, since there was no justification in the claim I'm responding to.
_5tsv · 2 years ago
My theory is not that individual genius accomplishes more than a group, but that whether a field is currently more suited for individual single-brain productivity or assembly-line cooperation is a signal of whether that field is nascent, at its peak, or in decline. I genuinely think software technology has not advanced in a long time. Computers got really fast, which allowed us to build a mountain of abstractions on top, and now you can throw together web apps, and there is no shortage of companies running around with that hammer and banging on everything. But the actual boundaries of what is possible with software have not moved in a while.

Most of the software nowadays (in addition to being terribly slow and broken, by the way) feels like stuff that doesn't actually need to exist. It doesn't advance the boundaries of technology or humanity's skill tree in any way. It's a "nice to have" — it might have some social value or make some business process more efficient or whatever — but no one looks at it and goes, wow, we couldn't do that before.

Someone is going to bring up AI and LLMs, and about that, I will just say that ChatGPT was a real accomplishment... but after that, I have not seen any use cases of LLMs that represent a genuine step forward. Like smart contracts, it's a solution in desperate search of a problem, a technology in search of a breakout app. You feel like there's just no way it doesn't have wide reaching implications, and everyone around you (especially VCs) talks about the "possibilities unlocked" — but two years in and we haven't seen a single actually interesting use case. (Again, of course they're being incorporated to optimize some business process or whatever. But who cares?)

Meanwhile, as far as the day to day reality of software engineering goes, if you work professionally as an engineer, your job is more or less wrangling microservices in one form or another. If you're at a big company this is probably literal, but even if you're at a startup the bigco transplants start doing microservice stuff basically on arrival, and a bunch of the "best practices" and frameworks and tools are meant to mimic microservices type thinking (making things more complicated rather than less).

As a final note, I grew up being a giant fan of YC and PG's essays, so I will make a comment about the latest batch of YC startups. It seems to me that a bunch of them basically exist to make some process more efficient. But then you wonder, why does that process need to exist in the first place? Like there was one startup helping you to optimize your AWS costs or something, but why is software written in such a way to begin with that you have all these microservices and towers of abstractions and untrackable infrastructure? Like you look at these startups solving a problem of solving a problem... and then you get to the root problem, and the whole thing is just a giant castle in the sky. I think this describes much of the motivating thrust of software today.

sublinear · 2 years ago
> software engineering is now a giant assembly line and the productivity (or skill required) per individual has dropped to near zero

If you want a more engaging job you can take a paycut and work at a startup where the code is an even bigger mess written by people who are just as lazy and disorganized but in another (more "prestigious") programming language.

Joking aside, when inevitably someone needs to fix a critical bug you'll see the skills come out. When products/services become stable and profitable it doesn't necessarily mean the original devs have left the company or that someone can't rise to the occasion.

im3w1l · 2 years ago
I think that is unfair. The productivity per person in old projects like RollerCoaster Tycoon is impressive. But the overall result pales in comparison with modern games. Saying the field has stopped producing anything of real value is very wrong.

To me it's like comparing a cathedral made by 100 people over 100 years to a shack made by one person in one month. Like, it stands, I suppose. It gives him shelter and lets him live. But it's no cathedral.

_5tsv · 2 years ago
I probably should've defined "real value" more carefully. My framework here is basically the idea of technology as humanity's "skill tree," the total set of stuff we're able to do. Back when RCT (and Doom before that) were written, we genuinely weren't sure if such things could be created, and their creation represented a novel step forward. The vibe in software feels like we're long past the stage of doing things like that, like we're not even attempting to do things like that. I can only think of two things in recent history: whatever strand of AI culminated in ChatGPT, which is genuinely impressive, and Bitcoin, which ostensibly was created by one guy.
bjornasm · 2 years ago
I have been thinking about something along these lines as well. With the professionalization/industrialization of software you also have extreme modularization/specialization. I find it really hard to follow the web-dev part of programming, even if I just focus on, say, Python, as there are so many frameworks and technologies that serve each little cog in the wheel.
Viliam1234 · 2 years ago
It's interesting; I have the opposite feeling. Twenty years ago I assumed that as the field of computer science grew, people would specialize more and more. Instead, I see a pressure first to become a "full stack developer", then a "dev ops", and now with the cloud everyone is also an amateur network administrator.

Meanwhile, everything is getting more complex. For example, consider logging: a few decades ago you simply wrote messages to standard output and maybe also to a file. Now we have libraries for logging, libraries to manage the libraries for logging, domain-specific languages that describe the structure of the log files, log rotation, etc. And logging is just one of many topics the developer needs to master; there is also database modeling, library management, automated testing, etc. And just when you learn how to use some framework like a pro, it gets thrown away and something else becomes fashionable instead.

And instead of being given a clear description of a task and left alone to do it, what we have is endless "agile" meetings, unclear specifications that change twice anyway before you complete the task, and a Jira full of tickets with priorities where most tickets are the highest priority anyway.

So, the specialization of the frameworks, yes there is a lot of it. But with specialization of the developers, it seems the opposite is the trend. (At least in my bubble.)

nighthawk454 · 2 years ago
I'm inclined to agree. People seem to be getting hung up on 'no, _my_ React app is hard and challenging for me!' which completely misses the point. I think it's a combination of a few things:

- skill is generally log-normally distributed, so there are few exceptional people anyway (see also Sturgeon's law: 99% of everything is crap)

- smaller/earlier fields may self-select for extreme talent as the support infrastructure just isn't there for most to make it, leading to an unrealistically high density of top talent

- it's a fundamentally different thing to make a substantial impact than to cookie-cutter a todo list app by wrangling the latest 47 frameworks. In that sense, it's highly unlikely that the typical engineering career trajectory will somehow hit an inflection point and change gears

- presumably, from a cost/benefit perspective there's a strong local maximum in spewing out tons of code cheaply for some gain, vs. a much more substantial investment in big things whose gains are harder to realize (further out, less likely, etc.). The more the field develops, the lower the skill floor gets to hit that local maximum, which therefore increases the gap between the typical talent and the top talent

- there's not a lot of focus on training top talent. It's not really a direct priority of most organizations/institutions by volume. Most coders are neither aiming for nor pushed towards that goal

All told, I think there's a bunch of systemic effects at play. It seems unlikely to expect the density of talent to just get better for no reason, and it's easier to explain why, left unchecked, the average talent would degrade as a field grows. Perhaps that's not even an issue and we don't need 'so many cooks in the kitchen', if the top X% can push the foundations enough for everyone else to make apps with. But the general distribution should be worth paying attention to, because if we get it wrong it's probably costly to fix.

beryilma · 2 years ago
> _my_ React app is hard and challenging for me

I love the idea of "essential vs artificial complexity". React, in claiming to solve the essential complexity of state management (among other things), creates a bunch of artificial complexity. Software engineers are no better off as a result: they become masters of frameworks, but not masters of their art. That is why the comparison to an assembly line is apt. Doing assembly-line work is still hard, but it is not software "engineering" in a certain definition of the word.

BarryMilo · 2 years ago
Skill required near zero? I'm approaching 15 years as a developer and React is still completely foreign to me. I'd need days full-time just to learn one version of it, not to mention every prerequisite skill.

And what do you mean by "real" value? Carrots have value because we can eat them, React apps have value because they communicate information (or do other things app do).

Or do you mean artistic value? We can agree, if you like, that SPAs are not as much "art" as RCT. But... who cares?

beryilma · 2 years ago
Mathematics, or more generally academia, is no panacea either. At least in engineering and applied math, there is a different assembly line of "research" and publications going on. We live in the era of enshittification of everything.

In both software and math, there are still very good people. But they are too few and far between. And it is hard to separate the good from the bad.

Tarean · 2 years ago
Computer-checked proofs are one area where AI could be really useful fairly soon, though it may be closer to neural networks in chess engines than full LLMs.

You already use tons of solvers because proving everything by hand is mind-numbingly tedious and time-consuming, but tactics/solvers really struggle if you throw too many theorems and lemmas at them. Neural nets as search machines to pattern-match relevant lemmas fit perfectly. Similarly, induction and higher-order unification are full-on code synthesis, where brute-force iterating over all possible syntax trees is really inefficient.

And solvers do a backtracking search anyway, so if the AI returns 95% irrelevant lemmas it doesn't matter. It would still be a dramatic improvement over manually searching for them.
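A minimal sketch (Python) of what that lemma-ranking step could look like, assuming some model has already produced vector embeddings for the goal and for every lemma in the library; all the names here are made up for illustration:

    # Hypothetical premise-selection sketch: rank library lemmas by embedding
    # similarity to the current goal, then hand the top-k to a backtracking solver.
    import numpy as np

    def rank_lemmas(goal_vec: np.ndarray, lemma_vecs: np.ndarray, k: int = 20):
        """Return indices of the k lemmas whose embeddings best match the goal."""
        goal = goal_vec / np.linalg.norm(goal_vec)
        lemmas = lemma_vecs / np.linalg.norm(lemma_vecs, axis=1, keepdims=True)
        scores = lemmas @ goal          # cosine similarity against every lemma
        return np.argsort(-scores)[:k]  # best-scoring lemma indices first

The shortlist only needs to contain the useful lemmas somewhere; the irrelevant 95% just cost the solver some backtracking time.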

Though I'm not convinced that computer-checked proofs are necessarily good for communication. There are reasons besides page count that proofs for human consumption are high-level and gloss over details.

qsort · 2 years ago
Not a mathematician, but knowing how proof checkers work, I don't expect them to become practical without some significant future advancement; they certainly aren't right now.

The main problem is that in order to work they have to be connected to the formal definition of the mathematical objects you are using in your proof. But nobody thinks that way when writing (or reading!) a proof; you usually have a high-level, informal idea of what you are "morally" doing, and then you fill in formal details as needed.

Computer code (kinda) works because the formal semantics of the language is much closer to your mental model, so much so that in many cases it's possible to give an operational semantics, but generally speaking math is different in its objectives.

g15jv2dp · 2 years ago
You're commenting on a thread about an interview where an actual Fields medalist explains how proof assistants are actually used to do meaningful mathematical work, today. The issues you raise are all discussed in the interview.
jvanderbot · 2 years ago
You yourself identified some problems that a modern AI guru would salivate over: formal specifications from natural language, "filling in" formal details, etc.

I'll add another one: NN-based speedups to constraint solvers, which usually use something like branch-and-bound. This is (I've heard) a very active area of research.

raincole · 2 years ago
The whole article is about Terence Tao predicting it will become practical without a significant breakthrough.

>> I don’t think mathematics will become solved. If there was another major breakthrough in AI, it’s possible, but I would say that in three years you will see notable progress, and it will become more and more manageable to actually use AI.

He thinks you need another breakthrough to solve mathematics, i.e. to make mathematicians obsolete. But even if a major breakthrough doesn't happen, AI will be useful for mathematicians in 3 years.

Of course you can still disagree with Terence Tao. Being one of the best mathematicians doesn't make him an oracle.

> But nobody thinks that way when writing (or reading!) a proof

He even very specifically addressed this issue with:

>> Let’s say an AI supplies an incomprehensible, ugly proof. Then you can work with it, and you can analyze it. Suppose this proof uses 10 hypotheses to get one conclusion—if I delete one hypothesis, does the proof still work?

He clearly believes an ugly, incomprehensible proof is much better than no proof at all.

cjfd · 2 years ago
Actually, it is possible to do a high level outline of a proof development and later fill in the details. This is done by starting out with axioms. I.e., don't define the basic types but state that they exist as axioms and also write down their basic properties as axioms. Then later, when you have some development going on, fill in the blanks by turning the axioms into definitions.
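A toy Lean 4 sketch of that workflow, with entirely made-up names, just to show the shape of it:

    -- Step 1: postulate the basic type and the properties you expect to need.
    axiom Widget : Type
    axiom combine : Widget → Widget → Widget
    axiom combine_comm : ∀ a b : Widget, combine a b = combine b a

    -- Step 2: build the high-level development on top of the axioms.
    theorem combine_comm_symm (a b : Widget) :
        combine b a = combine a b :=
      (combine_comm a b).symm

    -- Step 3 (later): replace the axioms with real definitions and proved
    -- lemmas; the development above keeps type-checking unchanged.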
bramhaag · 2 years ago
Automated theorem proving is a concept that has been around for a long time already.

Isabelle's sledgehammer strategy combines automatic theorem provers (E, Z3, SPASS, Vampire, ...) and attempts to prove/refute a goal. In theory this seems fine, but in practice you end up with a reconstructed proof that applies 12 seemingly arbitrary lemmas. This sort of proof is unreadable and very fragile. I don't see how AI will magically solve this issue.

gowld · 2 years ago
What is a "fragile" proof?
trott · 2 years ago
> Computer-Checked proofs are one area where ai could be really useful fairly soon. Though it may be closer to neural networks in chess engines than full llm's.

Yes, if significant progress is made in sample-efficiency. The current neural networks are extremely sample-inefficient. And formal math datasets are much smaller than those of, say, Python code.

munchler · 2 years ago
Terence Tao also gave a great talk on this topic a few months ago, where he goes into more detail about uses of Lean: https://www.youtube.com/watch?v=AayZuuDDKP0
Tossrock · 2 years ago
A working mathematician recently got GPT-4o to prove an (AFAIK) novel lemma while pursuing their research:

"My partner, who is mathematician, used ChatGPT last week for the first time to prove some lemmas for his research. He already suspected those lemmas were true and had some vague idea how to approach them, but he wasn't expert in this type of statement. This is the first time that he got correct and useful proofs out of the models.

[...]

These are the two lemmas. The first one is something that a collaborator of my partner noticed for small values of e in their computations. ChatGPT did not find the proof before being told to try Mobius functions.

https://chatgpt.com/share/9ee33e31-7cec-4847-92e4-eebb48d4ff...

The second looks a bit more standard to me, probably something that Mathematica would have been able to do as well. But perhaps that's just because I am more familiar with such formulas. Still, Mathematica doesn't give a nice derivation, so this is useful.

https://chatgpt.com/share/7335f11d-f7c0-4093-a761-1090a21579...

"

a_wild_dandan · 2 years ago
In a recent interview, Ben Goertzel mentioned having a PhD-level discussion with GPT-4 about novel math ideas cooked up by himself and a colleague. These things are becoming scary good at reasoning and heavyweight topics. As the ML field continues focusing on adding more System Two capabilities to complement LLMs' largely System One thinking, things will get wild.
UniverseHacker · 2 years ago
This is my experience as well: even the popular hype underestimates how creative and intelligent LLMs can be. Most discussions reason about what they should or shouldn't be capable of based on how we think they work, without really looking directly at what they can actually do, and not focusing on what they can't do. They are very alien: much worse than humans at many things, but also much better at others, and not in the areas we would expect in either case. Ironically they are especially good at creative and abstract thinking and very bad at keeping track of details and basic computation, nearly the opposite of what we'd expect from computers.
fovc · 2 years ago
I’m pretty sure that 2nd proof exists in some book or ebook somewhere; generatingfunctionology was the one that came to mind.

Impressive recall, but not novel reasoning

williamcotton · 2 years ago
I’m pretty sure that LLMs can generalize a deep model and can create novel assemblages and are not just search engines.

The burden of proof would be on you to find the existing mathematical, erhm, proof.

Sniffnoy · 2 years ago
Sorry, what is this quoted from?

It's interesting because that first one is something I was discussing recently with my friend Beren (does he have an account here? idk). We were thinking of it as summing over ordered partitions, rather than over partitions with a factor of |tau|!, but obviously those are the same thing.

(If you do it over cyclically ordered partitions -- so, a factor of (|tau|-1)! rather than |tau|! -- you get 0 instead of (-1)^e.)

Here's Beren's combinatorial proof:

First let's do the second one I said, about cyclically ordered ones, because it's easier. Choose one element of the original set to be special. Now we can set up a bijection between even-length and odd-length cyclically ordered partitions as follows: If the special element is on its own, merge it with the part after it. If it's not on its own, split it out into its own part before the part it's in. So if we sum with a sign factor, we get 0.

OK, so what about the linearly ordered case? We'll use a similar bijection, except it won't quite be a bijection this time. Pick an order on your set. Apply the above bijection with the last element as the special element. Except some things don't get matched, namely, partitions that have the last element of your set on its own as the last part.

So in that case, take the second-to-last element, and apply recursively, etc., etc. Ultimately everything gets matched except for the partition where everything is separate and all the parts are in order. This partition has length equal to the size of the original set. So if the original set was even you get one more even one, and if the original set was odd you get one more odd one. Which, rephrased as a sum with a sign factor, yields the original statement.
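(For anyone who wants to poke at this numerically, here is a small Python brute-force check of both statements for tiny sets, weighting each unordered partition by |tau|! for the linearly ordered count and by (|tau|-1)! for the cyclically ordered one. It's just a sanity check written against the statements above, not part of the argument:)

    # Enumerate all set partitions of {0, ..., e-1} and check:
    #   sum of (-1)^|tau| * |tau|!      over partitions == (-1)^e  (linearly ordered)
    #   sum of (-1)^|tau| * (|tau|-1)!  over partitions == 0       (cyclic, e >= 2)
    from math import factorial

    def partitions(elems):
        """Yield every partition of the list `elems` as a list of blocks."""
        if not elems:
            yield []
            return
        first, rest = elems[0], elems[1:]
        for p in partitions(rest):
            yield [[first]] + p                         # `first` in its own block
            for i in range(len(p)):                     # or joined to an existing block
                yield p[:i] + [[first] + p[i]] + p[i + 1:]

    for e in range(1, 9):
        parts = list(partitions(list(range(e))))
        linear = sum((-1) ** len(p) * factorial(len(p)) for p in parts)
        cyclic = sum((-1) ** len(p) * factorial(len(p) - 1) for p in parts)
        print(e, linear, cyclic)  # expect (-1)**e and, for e >= 2, 0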

Would be interested to send this to the mathematician you're referring to, but I have no idea who that might be since you didn't say. :)

Tossrock · 2 years ago
This was from the current ACX Open Thread.
krackers · 2 years ago
Woah that second proof is really slick. Is there a combinatorial proof instead of algebraic proof as to why that holds?
gjm11 · 2 years ago
Yes.

(HN comments aren't the best medium for writing mathematics. I hope this ends up reasonably readable.)

(d choose k) is the number of ways to pick k things from d.

(d choose k) k^2 is the number of ways to pick k things from d, and then colour one of them red and one of them (possibly the same one) blue.

(-1)^k (d choose k) k^2 is that (when k is even), or minus that (when k is odd).

So the identity says: Suppose we consider a set of d things, and we consider picking some of them (meaning 0 or more, though actually 0 is impossible; k is the number we pick) and colouring one of those red and one blue; then there are the same number of ways to do this picking an odd number of things as picking an even number.

Well, let's describe the process a bit differently. We take our set of d things. We colour one of them red. We colour one of them blue. Then we pick some set of the things (k in number), including the red thing and the blue thing (which may or may not also be the red thing). And the claim is that there are the same number of ways to do this with an even value of k as with an odd number.

But now this is almost trivial!

Once we've picked our red and blue things, we just have to decide which of the other things to pick. And as long as there's at least one of those, there are as many ways to pick an odd number of things as to pick an even number. (Fix one of them; we can either include it or not, and switching that changes the parity.)

So if d>2 then we're done -- for each choice of red and blue, there are the same number of "odd" and "even" ways to finish the job.

What if d<=2? Well, then the "theorem" isn't actually true.

d=1: (1 choose 0) 0^2 - (1 choose 1) 1^2 = -1.

d=2: (2 choose 0) 0^2 - (2 choose 1) 1^2 + (2 choose 2) 2^2 = 2.
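(And a quick Python check of the sum as stated, nothing more:)

    # Check sum over k of (-1)^k * C(d, k) * k^2 for small d.
    from math import comb

    for d in range(1, 8):
        total = sum((-1) ** k * comb(d, k) * k ** 2 for k in range(d + 1))
        print(d, total)  # expect -1 for d = 1, 2 for d = 2, and 0 for every d >= 3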

Chinjut · 2 years ago
ChatGPT just spit out nonsense in the first example. Look at the sum in step 6, over τ ≤ τ. Is τ the bound variable of the summation or a free variable? Put another way, how many terms are in this summation over "τ ≤ τ"? And how does this relate to establishing either the LHS or RHS of part 3, to then conclude the other side? Nothing actually coheres. What happened here is that ChatGPT remembered the Möbius function for the partition lattice, regurgitated it* without providing proof, and then spit out some nonsense afterwards that looks superficially reasonable.

But establishing that Möbius function is essentially the whole ballgame! The question being asked is very nearly the same as just being asked to prove that the Möbius function has that form.

[*: The regurgitation also has a slight error, as μ(τ, σ) with τ a refinement of σ is actually (-1)^(|τ| - |σ|) * the product of (|τ restricted to c| - 1)! over each equivalence class c in σ. The formula given by ChatGPT matches this one when |σ| = 1, which is the important case for the proof at hand, but is incorrect more generally.]

----

For what it's worth, the desired fact can be shown just by elementary induction, without explicitly invoking all the Möbius function machinery anyway:

Let N(t, e) be the number of partitions of {1, ..., e} into t many equivalence classes [we will always take e to be a non-negative integer, but allow for t to be an arbitrary integer, with N(t, e) = 0 when t is negative]. The sum in question is of (-1)^t * t! * N(t, e) over all t. Refer to this sum as Sum(e). We wish to show that Sum(e) comes out to (-1)^e. Since Sum(0) = 1 trivially, what we need to show is that Sum(e + 1) = -Sum(e).

There are two kinds of partitions on the set {1, ..., e + 1}: Those which put e + 1 in an equivalence class all on its own and those which put it in the same equivalence class as some value or values in {1, ..., e}. From this, we obtain the recurrence N(t, e + 1) = N(t - 1, e) + t * N(t, e).

Accordingly, Sum(e + 1) = Sum of (-1)^t * t! * N(t, e + 1) over all t = Sum of (-1)^t * t! * N(t - 1, e) over all t, plus sum of (-1)^t * t! * t * N(t, e) over all t. The first of these two sub-sums can be reparametrized as the sum of (-1)^(t + 1) * (t + 1)! * N(t, e) over all t. Now recombining the two sub-sums term-wise, and keeping in mind (-1)^(t + 1) * (t + 1)! = (-t - 1) * (-1)^t * t!, we get Sum(e + 1) = sum of (-1)^t * t! * (-t - 1 + t) * N(t, e) over all t. As (-t - 1 + t) = -1, this comes to -Sum(e) as desired, completing the proof.
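(A small Python check of this argument: it builds N(t, e) directly from the recurrence above and confirms Sum(e) = (-1)^e for small e. Just a numerical companion to the induction, not a replacement for it.)

    # N(t, e): partitions of {1, ..., e} into t classes, via the recurrence
    # N(t, e + 1) = N(t - 1, e) + t * N(t, e); then check Sum(e) == (-1)^e.
    from functools import lru_cache
    from math import factorial

    @lru_cache(maxsize=None)
    def N(t, e):
        if e == 0:
            return 1 if t == 0 else 0
        if t <= 0:
            return 0
        return N(t - 1, e - 1) + t * N(t, e - 1)

    for e in range(10):
        assert sum((-1) ** t * factorial(t) * N(t, e) for t in range(e + 1)) == (-1) ** e
    print("Sum(e) == (-1)^e holds for e = 0..9")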

Chinjut · 2 years ago
The second ChatGPT example proof is mostly fine, except it failed to see that the claimed theorem isn't actually true at d = 2. (It writes "d ≥ 2" for a step in its reasoning where d ≥ 3 is actually required.)

CloseChoice · 2 years ago
3 points stand out to me in this interview:

- Project manager mathematicians: Tao sketches a future where mathematical insights are "produced" like anything else in our society. He attributes the lack of this productionalization of mathematics to a lack of tools in the area, but AI and proof assistants might be revolutionary in this regard (proof assistants already are). Human interaction and guidance are still needed.

- Implicit knowledge: he points out that so much knowledge is not in papers, e.g. intuition and knowledge of failures. This is crucial and makes it necessary even for top mathematicians to talk to one another so they don't repeat the same mistakes.

- Formalization of mathematics: one would think that mathematics is pretty formalized already (and it is), but in papers a lot of common knowledge is taken for granted, so having proofs formalized in a way that proof assistants can understand (see the tiny example below) will help more people actually understand what is going on.
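For readers who haven't seen one, here is roughly the smallest possible example of a fully machine-checkable statement and proof, written in Lean 4 (the proof assistant Tao discusses); it only uses the standard library lemma Nat.add_comm, and the name of the theorem is made up:

    -- Commutativity of addition on natural numbers, checked by Lean's kernel.
    theorem add_comm_example (m n : Nat) : m + n = n + m :=
      Nat.add_comm m n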

I think this just shows how Tao always searches for new ways to do mathematical research, which I first came across in his talk about one of the polymath projects.