Peano arithmetic is enough, because Peano arithmetic encodes computation

btilly · 9 months ago

This is a stack overflow question that I turned into a blog post.

It covers both the limits of what can be proven in the Peano Axioms, and how one would begin bootstrapping Lisp in the Peano Axioms. All of the bad jokes are in the second section.

Corrections and follow-up questions are welcome.

doodlebugging · 9 months ago

After putting on my boots and wading through all of that I think that you have one edit to make.

In the "Why Lisp?" section there is a bit of basic logic defined. The first of those functions appears to have unbalanced parentheses.

>(defun not (x) > (if x > false > true)

I have a compulsion that I can't control when someone starts using parentheses. I have to scan to see who does what with who.

You later say in the same section

>But it is really easy to program a computer to look for balanced parentheses

Okay. This is pretty funny. Thanks for the laugh. I realize that you weren't doing that but it still is funny that you pointed out being able to do it.

This later comment in the "Basic Number Theory" section was also funny.

>; After a while, you stop noticing that stack of closing parens.

I really enjoyed reading this post. Great job. Though it has been a long time since I did anything with Lisp I was able to walk through and get the gist again.

btilly · 9 months ago

Thank you for the correction!

And I'm glad that someone liked some of my attempts at humor. As my wife sometimes says, "I know that you were trying to make a joke, because it wasn't funny."

kazinator · 9 months ago

> I have to scan to see who does what with who.

Are you saying that parentheses introduce the problem of having to scan to see what goes with what?

As in, if we don't have parentheses, but still have recursive/nested structure, we don't have to scan?

Cheyana · 9 months ago

Thanks for this. In another strange internet coincidence, I was asking ChatGPT to break down the fundamentals of the Peano axioms just yesterday and now I see this. Thumbs up!

burnt-resistor · 9 months ago

I suspect in the near future (if not already) ChatGPT data will be sold to data brokers and bough by Amazon such that writing a prompt will end up polluting Alexa product recommendations within a few minutes to hours.

im3w1l · 9 months ago

Well there was a post on mathmemes a day ago about teaching kids set theory as a foundation for math with some discussion of PA. So maybe related ideas are echoing across the intertubes in this moment?

Art9681 · 9 months ago

I've noticed this too. I will be researching a topic elsewhere and then it seems to pop up in HN. Am I just looking for patterns where there are none, or is there some trickery happening where HN tracks those activities and mixes in posts more relevant to my interests with the others?

RcouF1uZ4gsC · 9 months ago

This proves that ChatGPT sells your data to HN which then decides which posts to put on the front page.

edanm · 9 months ago

This is fascinating! I haven't read much past the intro yet, but I find the whole premise that you can prove all specific instances of Goodstein sequences terminate at 0 within PA, but not that all sequences terminate (it's a trivial result but still interesting).

I also find it super weird that Peano axioms are enough to encode computation. Again, this might be trivial if you think about it, but that's one self-referential layer more than I've thought about before.

One question for you btilly - oddly enough, I just recently decided to learn more Set Theory, and actually worked on an Intro to Set Theory textbook up to Goodstein sequences just last week. I'm a bit past that.

Do you have a good recommendation for a second, advanced Set Theory textbook? Also, any recommendation for a textbook that digs into Peano arithmetic? (My mini goal since learning the proof for Goodstein sequences is to work up to understanding the proof that Peano isn't strong enough to prove Goodstein's theorem, though I'll happily take other paths instead if they're strongly recommended.)

btilly · 9 months ago

My apologies for having missed this request.

I don't have good suggestions for a good set theory textbook. Grad school was 30 years ago, and I didn't specialize in logic.

The best set theory book that I read was https://www.amazon.com/Naive-Theory-Undergraduate-Texts-Math.... But that one is aimed at people who want to go into math but do not wish to specialize in set theory, and not at people who want to actually learn set theory.

anthk · 9 months ago

Boot sector Lisp bootstraps itself.

https://justine.lol/sectorlisp2/

Also, lots of Lisp from https://t3x.org implement numerals (and the rest of stuff) from cons cells and apply/eval:

'John McCarthy discovered an elegant self-defining way to compute the above steps, more commonly known as the metacircular evaluator. Alan Kay once described this code as the "Maxwell's equations of software". Here are those equations as implemented by SectorLISP:

ASSOC EVAL EVCON APPLY EVLIS PAIRLIS '

Ditto with some Forths.

Back to T3X, he author has Zenlisp where the meta-circular evaluation it's basically how to define eval/apply and how to they ared called between themselves in a recursive way.

http://t3x.org/zsp/index.html

btilly · 9 months ago

I knew that this sort of stuff was possible, but it is fun to see it.

When it comes to bootstrapping a programming language from nothing, the two best options are Lisp and Forth. Of the two, I find Lisp easier to understand.

Timwi · 9 months ago

> Corrections and follow-up questions are welcome.

There are two places where you accidentally wrote “omega” instead of “\omega”.

btilly · 9 months ago

Thanks!

I am away from my computer for the day, but I will fux it later.

btilly · 9 months ago

It is now fixed. :-)

Deleted Comment

ikrima · 9 months ago

Hey! This is fantastic and actually ties in some very high disparate parts of math. Basically, reorient & reformulate all of math/epistomology around discrete sampling the continuum. Invert our notions of Aleph/Beth/Betti numbers as some sort of triadic Grothendieck topoi that encode our human brain's sensory instruments that nucleate discrete samples of continuum of reality (ontology)

Then every modal logic becomes some mapping of 2^(N) to some set of statements. The only thing that matters is how predictive they are with some sort of objective function/metric/measure but you can always induce an "ultra metric" around notions of cognitive complexity classes i.e. your brain is finite and can compute finite thoughts/second. Thus for all cognition models that compute some meta-logic around some objective F, we can motivate that less complex models are "better". There comes the ultra measure to tie disparate logic systems. So I can take your Peano Axioms and induce a ternary logic (True, False, Maybe) or an indefinite-definite logic (True or something else entirely). I can even induce bayesian logics by doing power sets of T/F. So a 2x2 bayesian inference logic: (True Positive, True Negative, False Positive, False Negative)

Fun stuff!

Edit: The technical tldr that I left out is unification all math imho: algebraic topology + differential geometry + tropical geometry + algebraic analysis. D-modules and Microlocal Calculus from Kashiwara and the Yoneda lemma encode all of epistemology as relational: either between objects or the interaction between objects defined as collision less Planck hyper volumes.

basically encodes the particle-wave duality as discrete-continuum and all of epistemology is Grothendieck topoi + derived categories + functorial spaces between isometry of those dual spaces whether algebras/coalgebra (discrete modality) or homologies/cohomologies (continuous actions)

Edit 2: The thing that ties everything together is Noether's symmetry/conserved quantities which (my own wild ass hunch) are best encoded as "modular forms", arithmetic's final mystery. The continuous symmetry I think makes it easy to think about diffeomorphisms from different topoi by extracting homeomorphisms from gauge invariant symmetries (in the discrete case it's a lattice, but in the continuous we'd have to formalize some notion of liquid or fluid bases? I think Kashiwara's crystal bases has some utility there but this is so beyond my understanding )

ricardobeat · 9 months ago

> Invert our notions of Aleph/Beth/Betti numbers as some sort of triadic Grothendieck topoi that encode our human brain's sensory instruments that nucleate discrete samples of continuum of reality (ontology)

There’s probably ten+ years of math education encoded in this single sentence?

ikrima · 9 months ago

You know what, since you put in all that work, here's my version using p-adic geometry to generalize the concept of time as a local relativistic "motive" (from category theory) notion of ordering (i.e. analogous to Grothendieck's generalization of functions as being point samples along curves of a basis of distributions to generalize notions of derivatives):

https://github.com/ikrima/topos.noether/blob/aeb55d403213089...

dooglius · 9 months ago

The interesting part to me (I have a background in both math+programming) isn't so much the encoding of computation but that one can work around the independence of goodstein's theorem in this self-referential way. I think this implies that PA+"PA is omega-consistent" can prove goodstein's theorem, and perhaps can more generally do transfinite induction up to epsilon_0? EDIT: I think just PA+"PA is consistent" is enough?

Kotlopou · 9 months ago

Asker of the SO question here: I edited the question to link to a few other answers on this kind of thing. Essentially, "PA is consistent" is not enough, but a "uniform reflection principle" that says "if PA proves something, it's true" is enough. I'm not 100% certain that this principle is equivalent to omega-consistency, but if I'm reading this correctly, it should be:

https://en.wikipedia.org/wiki/%CE%A9-consistent_theory#Relat...

The Wikipedia article says T is omega-consistent if "T + RFN_T + the set of all true sentences is consistent", which should mean the same thing as "T + RFN_T is true".

Deleted Comment

codeflo · 9 months ago

I also like the recursion. In essence, you're making a meta-proof about what PA proves, and given that you trust PA, you also trust this meta-proof.

> I think just PA+"PA is consistent" is enough?

It's not clear to me how. I believe PA+"PA is consistent" would allow a model where Goodstein's theorem is true for the standard natural numbers, but that also contains some nonstandard integer N for which Goodstein's theorem is false. I think that's exactly the case that's ruled out by the stronger statement of ω-consistency.

bubblyworld · 9 months ago

Unfortunately not, and apparently no other purely universally quantified formulas will do either (so this is a more general thing, not specific to Con(PA)): https://math.stackexchange.com/questions/5003237/can-goodste...

On the first question, how do you encode omega-consistency as a formula of PA? Just curious, it's not at all obvious to me.

LegionMammal978 · 9 months ago

> On the first question, how do you encode omega-consistency as a formula of PA? Just curious, it's not at all obvious to me.

I was also wondering about that, but going by the Wikipedia definition, it doesn't seem too complicated: you say, "For all encoded propositions P(x): If for every x there exists an encoded proof of P(x), then there does not exist an encoded proof of 'there exists an x such that ¬P(x)'." That is, if you can prove a proposition for each integer in the metatheory, then quantifiers within the target theory must respect that.

NoahZuniga · 9 months ago

I think so, the math exchange post mentions that the PA + transfinite induction works on epsilon_0 proves PA. It seems likely to me that PA + PA is consistent would be able to prove transfinite induction on epsilon_0.

btilly · 9 months ago

Now we're getting a little beyond the detail that I feel comfortable making statements about.

ChatGPT tells me that PA+"PA is consistent" is not quite enough. I believe that it has digested enough logic textbooks that I'll believe that claim.

Huxley1 · 9 months ago

I was pretty amazed by how expressive Peano Arithmetic is when I first started using it. At first it just seems like a basic system, but once you realize that computation itself can be encoded inside PA, and that it can simulate many kinds of computation, things that felt complicated start to click.

Does anyone have recommendations for beginner-friendly resources that explain these encoding techniques?

btilly · 9 months ago

Anything by https://en.wikipedia.org/wiki/Gregory_Chaitin, for example https://www.amazon.com/Exploring-Randomness-Discrete-Mathema....

Animats · 9 months ago

This is very like Boyer-Moore theory[1], which builds up mathematics from the Peano axiom level.

Boyer and Moore wrote an automated theorem prover that goes with this theory. I have a working copy for GNU Common LISP at [2].

"It is perhaps easiest to think of our program much as one would think of a reasonably good mathematics student: given the axioms of Peano, he could hardly be expected to prove (much less discover) the prime factorization theorem. However, he could cope quite well if given the axioms of Peano and a list of theorems to prove (e.g., “prove that addition is commutative,” . . . “prove that multiplication distributes over addition,” . . . “prove that the result returned by the GCD function divides both of its arguments,” . . . “prove that if the products over two sequences of primes are identical, then the two sequences are permutations of one another”)." - Boyer and Moore

[1] https://www.cs.utexas.edu/~boyer/acl.pdf

[2] https://github.com/John-Nagle/nqthm/tree/master

dwohnitmok · 9 months ago

Your comment to JoJoModding in Math Stackexchange is incorrect.

"That's because there are nonstandard models of PA which contain infinite natural numbers. So PA may be able to prove that it produces a proof, but can't prove that the proof is of finite length. And therefore it might not be a valid proof."

If PA proves "PA proves X" then PA can prove X. This is because the key observation is not that there are nonstandard models, but rather that the standard natural numbers model PA.

Therefore if PA proves "PA proves X", then there is in fact a standard, finite natural number that corresponds to the encoded proof of "PA proves X". That finite natural number can be used then to construct a proof of X in PA.

dwohnitmok · 9 months ago

I haven't had the time to go through your argument in more detail, but it's important to note (because the natural language version of what you've presented is ambiguous) that you haven't shown "PA proves 'Provable(forall n, G(n))'" in which case it would be the case that in fact "PA proves 'forall n, G(n)'", but rather "PA proves 'forall n, Provable(G(n))'".

My logic is very rusty at this point, but if someone could give me an argument that you cannot move the 'Provable' outside the 'forall', I would really appreciate that, without making reference to Goodstein sequences. In other words, that in general for propositions 'P' it is not true that proving "forall n, Provable(P(n))" implies you can prove "Provable(forall n, P(n))".

btilly · 9 months ago

> If PA proves "PA proves X" then PA can prove X.

Not true.

From PA we can construct a function that can search all possible proofs that can be constructed in PA. In fact I outlined one way to do this at the end of my answer.

With this function, we can construct a function will-return that analyzes whether a given function, with a given input, will return. This is kind of like an attempted solution to the Halting Problem. We know that it doesn't always work. But we also know that it works a lot of the time.

From will-return we can create a function opposite-return that tries to return if a given function with a given input would not, and doesn't return if that function would. This construction is identical to the one in the standard proof of the Halting Problem.

Now we consider (opposite-return opposite-return opposite-return). (Actually you need a step to expand the argument into this recursive form. I've left that out, but that is identical to the one in the standard proof of the Halting problem.)

PA can prove the following:

- PA proves that if PA can prove that opposite-return returns, then it doesn't. - PA proves that if PA can prove that opposite-return doesn't return, then it does. - PA proves that if it can prove everything that it proves that it can prove, then PA must have a proof of one of the two previous statements. - Therefore PA proves that if it can prove everything that it proves that it can prove, then PA is inconsistent.

This is a form of Gödel's second incompleteness theorem.

And, therefore, there must be a distinction to be made between "PA proves" and "PA proves that it proves".

ComplexSystems · 9 months ago

I am not sure I follow and it is possible I may not be on the same wavelength, but here is another thought.

For any sentence S, if first-order PA (or any first-order theory) proves S, then that means that S holds true in every model of that theory, via the completeness (not incompleteness) theorem.

The statement "PA proves x" is equivalent to saying "there exists a natural number N which is a Gödel number encoding a proof of x." The letter "x", here, is a natural number that is assumed to encode some sentence we want to prove, that is, x = #S for some sentence S.

The above is a predicate with one existential quantifier: Provable(x) = there exists N such that IsProof(N, x) holds true, where IsProof says that N is the Gödel number of a valid proof ending in x.

If PA proves "Provable(x)", that means that the predicate "Provable(x)" holds in every model of PA. This means every model of PA would have some natural number N that satisfies IsProof(N, x). Of course, this number can vary from model to model.

However: the standard model, which has only standard natural numbers, is also another model of PA. So if PA proves "Provable(x)," and Provable(x) thus holds in every model, it must also hold in the standard model of PA. This means that there must be a regular, standard, finite natural number N_st that encodes a proof of x.

Since the standard model is an initial segment of every other model of PA, then every other model includes the standard model and all the standard natural numbers. Thus, if PA proves Provable(x), then the standard number N_st must exist in all models.

So we cannot have a situation where PA proves Provable(x) but all proofs of x must be nonstandard and infinite in length. This would mean no such proof exists in the standard model of PA - but then, via completeness, PA would not be able to prove "Provable(x)".

tromp · 9 months ago

> > If PA proves "PA proves X" then PA can prove X.

If we assume PA to be sound, then indeed everything it proves is true.

> Not true.

Now you're saying PA is unsound.

But your article wasn't about PA proves "PA proves X", it was about "forall n : PA proves G(n)".

For PA not to prove "forall n: G(n)", there is no soundness issue, only a ω-consistency issue.

dwohnitmok · 9 months ago

> PA proves that if it can prove everything that it proves that it can prove, then PA must have a proof of one of the two previous statements.

I don't believe this is true. I don't know what result you're using here, but I think you're mixing up "provable" and "true".

In particular your line of reasoning violates Lob's Theorem, which is a corollary of the second incompleteness theorem.

bubblyworld · 9 months ago

The standard model is a model of PA only if PA is consistent, which it cannot prove unless it is inconsistent (Godel's theorem). So your proposed proof doesn't work in PA itself, which is the point of that comment, I believe.

dwohnitmok · 9 months ago

I believe my proof actually works in systems even weaker than PA (but the metatheory of how exactly to do the encoding in a first-order theory is a bit of a headache for me at the moment).

In particular I'm not relying on the consistency of PA. If PA is inconsistent, "If PA proves 'PA proves X' then PA can prove X" also holds (trivially), because then PA can prove anything.

gsf_emergency_2 · 9 months ago

https://math.stackexchange.com/questions/4408124/what-does-t...

gugagore · 9 months ago

I was talking to someone about inductive data types, and showed them the zero/succ definition of `Nat`, e.g. in Lean or Rocq.

It was interesting because they were asking "is this all you need? What about the Peano axioms? Is there anything more primitive than the inductive data type?"

I bring it up because it's good not to take for granted that the Peano axioms are inherent, instead of just one design among many.

SabrinaJewson · 9 months ago

> Is there anything more primitive than the inductive data type?

I believe that the natural numbers are more primitive than inductive data types, since all inductive data types may be constructed from the natural numbers alongside a small number of primitive type formers (e.g. Π, Σ, = and Ω).

gugagore · 9 months ago

You don't need all the natural numbers for that, though. I think you need 0 and 1 only?

I think there are two primitive sets for dependent type theory. The one with omega, and then the one with inductive types. None of them need axioms like the Peano axioms.

cryptonector · 9 months ago

Pure lambda calculus is enough because lambda calculus encodes computation.