Lean 4 is an interactive theorem prover. It's also a programming language with a self-hosting compiler. This is a free book on using Lean 4 as a programming language, written without assuming any background in functional programming. It's intended to be accessible to Python, C#, Rust, Kotlin, Java, TypeScript, and Scala developers. Today marks the final release, after more than a year of writing.
Thank you very much for this book. It has been invaluable for my introduction to Lean and a joy to study from. I still have not read the last chapter, looking forward to it! Kudos for your effort, and I hope you will write some more stuff about Lean in the future.
Thank you for reading it, and I hope that the final chapter is also enjoyable for you. Right now, I plan to take a break - this book occupied every Saturday for about a year, and some time off is in order. But Lean is tons of fun, and I'd like to get back into it once I'm recovered a bit.
Congratulations on the publication! As a dabbler in strictly typed functional programming languages like Scala and F#, I have always been curious about proof-oriented languages such as Coq or Agda, but found it difficult to justify the time investment. Lean seems to position itself as a theorem proving language that also supports general-purpose programs. Looking forward to digging into your book!
Those other languages are also definitely worth learning. Happily, there's lots of cross-transfer of ideas and skills between them, so learning one will make the others easier. I got my start in dependent types with Software Foundations and Coq, and that was very helpful when learning Agda and Idris later. Similarly, skills from them transferred quite readily to Lean.
Unfortunately not. I wanted to produce PDF and epub versions in parallel with the HTML version, but getting those to be of sufficient quality would have blown the time budget for the project. There's some old code in the Git history for dumping the source to Pandoc's dialect of Markdown, from which I was going to generate those, but the differences in Markdown dialects are enough that it was a fair bit of work.
Dropping a print version and focusing on epub could very well be faster, as I suspect that an mdbook->epub pipeline is less challenging than creating a quality print-ready PDF. But no plans right now.
* tooling (can't say it's better, I haven't used Idris, but it's at least a lot nicer than Agda's: rustup-like version manager `elan`, own build system `lake`, official vscode extension supporting mini web apps which can interact with the code)
Small things I can remember:
* do-notation with early return and imperative loops
* easier termination proof handling (provide some expression using function arguments with `decreasing_by` and prove it's decreasing with `termination_by` block, which may be automatic even for non-structural recursion because of some built-in tactic)
Regarding tactics: I started with Coq and I felt for a long time that I had no idea what tactics actually do. Proofs with tactics look nice on the surface, but I really did not like hidden state and hidden actions.
Redoing some proofs in Lean by construction of proof terms was eye-opening, it's just functional programming with dependent types! I still think that teaching with tactics first without exposing proof terms is a mistake.
I've learned to see the value of tactics later, again thanks to Lean: there are proofs that are more natural in the tactics style. Sometimes a mixed approach is ideal: growing intermediate proof terms from the premises and then wrangling the hypothesis with tactics to meet the lemmas.
Lean occupies a different point in the design space. Its type theory is simpler and more conservative, its metaprogramming system is more reminiscent of Racket's (including hygienic procedural macros), and the focus on supporting professional mathematicians gives a different feel to the community and libraries. I think both are worth knowing, but either is great to learn. I hope the two projects learn more from each other in the future.
in C and C++, the conditional statement is written using if and else, while the conditional expression is written with a ternary operator ? and :.
ending a sentence with . is a must. Including a period directly after a syntactic construct about ? and : is .. jarring because it is possible to attempt to read this as :. as an atomic construct in the language. Or even worse, the three-dots pyramid in some mathematical proof notations.
> Also, I always find it interesting the ternary operation is constructed in statements/expressions in ? ... : notation but actually has "two" operators there. the ? is the conditional, the : is the binary-choice. It's a language syntax choice which "side" is true or false and it is also true that you can't construct statements in : without a prior ? but .. its hardly a single operator in syntax terms if it has two disjoint components. := is an unambiguous single operation. condition ? x : y is ... not. It's definedly ambiguous!
In C++, the lexical construct you describe is an atomic operator[0] and neither an expression nor a statement:
The first operand of the conditional operator is
evaluated and contextually converted to bool.
After both the value evaluation and all side effects
of the first operand are completed, if the result
was true, the second operand is evaluated. If the
result was false, the third operand is evaluated.
I took my comments about ?...: notation out. Basically, at a syntactic level its tempting to regard ? as "the operator" and : as some other operator but really the ternary operation only exists as some hypothetical function in 3 arguments:
ternary(condition, true-path, false-path)
which executes either true-path or false-path depending on condition. But in notational terms, how "big" condition and true and false parts are is rather unconstrained, so you wind up with the ? and the : widely separated. They aren't operators in the higher sense: they're the syntax which forms the ternary operation as a whole over the expression and it's condition.
Lexing bleeds into syntax and syntax bleeds into semantics.
:. is a possible faux-pas in lexing the sentence around the ?...: construct because of :. being unfortunate.
> its hardly a single operator in syntax terms if it has two disjoint components
I don't think "operator in syntax terms" is nearly as well-defined as OP thinks. Why couldn't `?` be parsed as half an operator and `:` as the other half? There's no rule that says a parser has to call each distinct symbol token its own "operator". In fact, I'd argue that the only reason this might seem seem like a rule is that almost all of the other operators in common use are either unary or binary, making it easy to use a single token for the operator itself. That's why it's called the "ternary operator"; it's the only one that operates on three things! The only alternative to spreading it out across separate tokens with an operand in between is to put multiple operands in a row on one side of it; as confusing as `foo ? bar : baz` might be, I have strong doubts that `foo ?: bar baz` would be less confusing in most real-world cases.
I'm the author - I think that it's good to signal this kind of thing redundantly, and not rely on the details of typesetting to avoid confusion. I'll create an issue in the repo to rephrase the sentence.
While Lean4 is meant to be much more of a general purpose language than it's predecessor, it is still built with being a proof assistant in mind. That being said it is self hosted and has been in development and use for quite some time, which says something of it's readiness.
Tooling around it great (it is installed with a rustup fork called elan).
As far as performance, it uses a particular version of ref-counting that allows mutation of unique references based on run-time tracking rather than compile time. If you accidentally keep a reference to an array and modify it, then you end up copying it rather than just mutating it.
Lean is currently moving to the 4th iteration which is the first intended to be a general-purpose programming language. It "is currently being released as milestone releases towards a first stable release". For now the main goal is to port mathlib to the new version, and then they will concentrate on the compiler. So it is not production ready. But that doesn't mean it is not suitable for building any programs now. There is a simple raytracer written in Lean [1]. I have built a chip8 interpreter with it and the only problem was the lack of an ecosystem, meaning I had to build the necessary libraries myself.
Now it has a RC GC and boxes everything >= 64 bits (including struct fields and IO/ST results), and as the compiler isn't polished it is probably significantly slower. In the referenced raytracer repo you can find rendering time compared to the C implementation (Lean is 25x slower, but that was a year ago).
Those other languages are also definitely worth learning. Happily, there's lots of cross-transfer of ideas and skills between them, so learning one will make the others easier. I got my start in dependent types with Software Foundations and Coq, and that was very helpful when learning Agda and Idris later. Similarly, skills from them transferred quite readily to Lean.
Deleted Comment
Looking forward to read it during Summer time.
Dropping a print version and focusing on epub could very well be faster, as I suspect that an mdbook->epub pipeline is less challenging than creating a quality print-ready PDF. But no plans right now.
* tactics (proof scripts are a lot easier than manual proving)
* syntax extensibility (Racket-like, supports custom elaboration/delaboration)
* mathlib (library of formalized math)
* tooling (can't say it's better, I haven't used Idris, but it's at least a lot nicer than Agda's: rustup-like version manager `elan`, own build system `lake`, official vscode extension supporting mini web apps which can interact with the code)
Small things I can remember:
* do-notation with early return and imperative loops
* easier termination proof handling (provide some expression using function arguments with `decreasing_by` and prove it's decreasing with `termination_by` block, which may be automatic even for non-structural recursion because of some built-in tactic)
Redoing some proofs in Lean by construction of proof terms was eye-opening, it's just functional programming with dependent types! I still think that teaching with tactics first without exposing proof terms is a mistake.
I've learned to see the value of tactics later, again thanks to Lean: there are proofs that are more natural in the tactics style. Sometimes a mixed approach is ideal: growing intermediate proof terms from the premises and then wrangling the hypothesis with tactics to meet the lemmas.
https://youtu.be/p4IrbnPomXg
in C and C++, the conditional statement is written using if and else, while the conditional expression is written with a ternary operator ? and :.
ending a sentence with . is a must. Including a period directly after a syntactic construct about ? and : is .. jarring because it is possible to attempt to read this as :. as an atomic construct in the language. Or even worse, the three-dots pyramid in some mathematical proof notations.
In C++, the lexical construct you describe is an atomic operator[0] and neither an expression nor a statement:
0 - https://en.cppreference.com/w/cpp/language/operator_otherLexing bleeds into syntax and syntax bleeds into semantics.
:. is a possible faux-pas in lexing the sentence around the ?...: construct because of :. being unfortunate.
I don't think "operator in syntax terms" is nearly as well-defined as OP thinks. Why couldn't `?` be parsed as half an operator and `:` as the other half? There's no rule that says a parser has to call each distinct symbol token its own "operator". In fact, I'd argue that the only reason this might seem seem like a rule is that almost all of the other operators in common use are either unary or binary, making it easy to use a single token for the operator itself. That's why it's called the "ternary operator"; it's the only one that operates on three things! The only alternative to spreading it out across separate tokens with an operand in between is to put multiple operands in a row on one side of it; as confusing as `foo ? bar : baz` might be, I have strong doubts that `foo ?: bar baz` would be less confusing in most real-world cases.
Thanks for the feedback!
I was stunned when I discovered it, because it does exactly what its name suggests: it's a huge library of math proofs available as a library in Lean.
https://github.com/leanprover-community/mathlib
Tooling around it great (it is installed with a rustup fork called elan).
As far as performance, it uses a particular version of ref-counting that allows mutation of unique references based on run-time tracking rather than compile time. If you accidentally keep a reference to an array and modify it, then you end up copying it rather than just mutating it.
Now it has a RC GC and boxes everything >= 64 bits (including struct fields and IO/ST results), and as the compiler isn't polished it is probably significantly slower. In the referenced raytracer repo you can find rendering time compared to the C implementation (Lean is 25x slower, but that was a year ago).
[1] https://github.com/kmill/lean4-raytracer