Lecture 33: Introduction to Proof
1 Purpose
Discuss limitations of PBT, introduce interactive theorem provers
2 Outline
Throughout the semester, we’ve seen the enormous variety of logical invariants we can express using Property Based Testing. We’ve also, in a more limited way, seen how we can use SMT solvers to exhaustively test certain spaces of inputs. SMT solvers are, indeed, widely used in industry.
The obvious weakness of PBT, which we have discussed many times, is that it only tests a certain subset of the inputs. In order for that to be useful, we have to be sure that we are testing a representative subset, and to know whether it is representative, we may have to already know where bugs may appear: possibly not a reasonable assumption! Despite that, it is remarkably flexible and useful.
SMT solvers don’t have that weakness: they exhaustively test their inputs, but their weakness is that they can only handle finite inputs. So they require that we finitize our inputs and our computations: e.g., we can’t deal with arbitrary lists, but rather bounded lists of length less than 5. If our code had a bug on length 6, or 10, or 100, then this testing, despite being "exhaustive", would not find it. For many applications, this is good enough – it’s certainly better than how the vast majority of software is tested – but there are domains (medical devices, cryptography, fly-by-wire systems, etc.) where we want to do better, and logic gives us tools.
How can we possibly reason about data of arbitrary size? Or at least, arbitrary size that will fit into our computers? We certainly can’t run on all inputs. The key idea is the same one as how we write recursive functions on data of arbitrary size.
Let’s consider one of the simplest possible functions:
def f (x : Nat) : Nat := 0 + x
And we want to show that f(x) = x for all natural numbers. If we were happy testing a few random numbers, we could use PBT, and if we were happy testing numbers up to some reasonable bit width, SMT would do very well. But we want to know this holds for any natural number.
In particular, we want to be sure that the function doesn’t do, for example, this:
def f (x : Nat) : Nat := if x > 2^64 then 0 else 0 + x
i.e., if the number is beyond any reasonable bound that we would have handed to an SMT solver, it no longer works: instead of returning the same number, it just returns 0. How can we prove this?
To understand this, let’s look carefully at how we can define natural numbers. In Lean, we define them as:
inductive Nat where
| zero
| succ (n : Nat)
i.e., a Nat is either zero, or it is the successor of some other Nat. This is an inductive definition, as it is defined in terms of itself, or inductively.
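To get a feel for this, here is how a few small numbers are built out of those two constructors (the name two below is just made up for illustration; the familiar numerals are shorthand for these constructor applications):

def two : Nat := Nat.succ (Nat.succ Nat.zero)

#check two  -- Nat
#eval two   -- 2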
To prove our theorem, that ∀ x, f(x) = x, we consider the two possibilities for our input, and prove that in either case the statement holds. We define our proof recursively; indeed, our proof is a recursive function. That means that in the succ case, we can call our proof, which is a recursive function, on the Nat that is inside (one less than our current value).
This works similarly to how we define recursive functions. Because proofs can get somewhat unwieldy, we will often use tactics to help construct them. Tactics are procedures that can inspect the current partially constructed proof and add to it. We switch into tactic mode with by. This allows us, among other things, to construct trivial equality proofs with rfl, and to rewrite what we are proving using rw and a comma-separated list of equalities inside of square brackets. Note that, in general, we can only directly prove trivial equalities (using rfl); for more interesting ones, we first have to rewrite them (using rw) into a trivial equality, which is exactly what we do here.
def nat_0_plus : forall (x : Nat), Nat.add 0 x = x
| Nat.zero => by rfl
| Nat.succ n => by rw [Nat.add, nat_0_plus n]
This is now a recursive definition that, for any natural number, constructs a proof that 0 + x is equal to x. It makes a recursive call to the proof in the case that x is not zero, and uses the trivial case (that 0 + 0 = 0), apparent from the definition, in the base case. In the recursive case, we first rewrite using the definition of addition. This gets us from Nat.add 0 (Nat.succ n) = Nat.succ n to Nat.succ (Nat.add 0 n) = Nat.succ n, since the definition of Nat.add is itself a structurally recursive definition defined on the second argument. Now, we can complete the proof by rewriting using a recursive call, which gives us that Nat.add 0 n = n.
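To make the rewriting steps visible, here is a sketch of the same proof (under a hypothetical name nat_0_plus') with the two rewrites separated and the intermediate goals, as Lean would display them, recorded as comments:

def nat_0_plus' : forall (x : Nat), Nat.add 0 x = x
  | Nat.zero => by rfl
  | Nat.succ n => by
    -- goal: Nat.add 0 (Nat.succ n) = Nat.succ n
    rw [Nat.add]
    -- goal: Nat.succ (Nat.add 0 n) = Nat.succ n
    rw [nat_0_plus' n]
    -- both sides are now identical, and rw closes goals
    -- that become trivially true after rewriting

The second rw finishes the proof because, after rewriting with the recursive call, the two sides of the equality are identical.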
I intentionally used the syntax for function definitions; you can equally use the near-synonym theorem (the only difference is that theorems cannot be used in executable code, and you cannot #eval them) instead of def:
theorem nat_0_plus : forall (x : Nat), Nat.add 0 x = x
| Nat.zero => by rfl
| Nat.succ n => by rw [Nat.add, nat_0_plus n]
A couple of other important things in this, our first proof: first, note that we have the same forall construct we’ve seen before. Lean is a higher-order logic, which means that we can quantify over terms that themselves involve quantification (we’ll see exactly what that means later). Next, note that our theorem statement is the type of our function: it’s what we return. This is the type of an equality proof (= is actually just another inductive type, though one for which there is lots of built-in support).
The idea that we construct proofs as functional programs, and the types are the theorem statements, is a very important idea, often called the Curry-Howard isomorphism, and it first showed up in a theorem prover called AUTOMATH, created by the mathematician and computer scientist N.G. de Bruijn in 1968. We talked about this a little in Lecture 20: Recursion & Induction.
It’s worth actually reflecting a bit on the history of these theorem provers, and really, the notion of truth that they give us.
3 What is proof?
To start, we need to be clear about what a proof is, and in particular, what it means within the Lean Theorem Prover. First, what is a proof? A proof is an argument for some statement, described so that someone else can follow it, and in doing so be convinced of the statement. Proofs are thus, historically, very social and contextual things: a step that the reader will be able to jump over can be left out, for example.
Related to proofs, and in some sense simpler, are counter-examples: these are refutations of statements by means of a concrete example. For example, the statement "all prime numbers are odd" can be refuted by giving the counter-example 2, which is both a prime number and not odd. (Relatedly, the fact that finding contradictions is often easier is the reason why many proofs in math assume the statement to be proved is not true, and then use that to derive a contradiction. It’s also how Property Based Testing works!)
There is still context, of course: what is a prime number, what is odd, etc. Every bit of mathematics and logic builds upon some shared social understanding of the meaning of certain things that people over centuries (or millennia) have found useful in trying to understand the world.
4 What is mechanized proof?
Much more recently, but nearly as old as computers (1950s), people have been trying to use computers to carry out, or assist in carrying out, proofs. The appeal is two-fold: first, if proofs are intended to be a series of steps, each of which follows from the previous according to some logical rule, computers are quite good at following steps without forgetting ones in the middle. Humans are certainly more creative, but this comes at a cost: we sometimes make leaps without explanation. Of course, sometimes those leaps are correct, and some of the most creative, baffling mathematics arises from those kinds of jumps (e.g., some of the number theory of Ramanujan, whose results, he said, were divinely inspired, appearing to him in dreams). However, sometimes these leaps are difficult to confirm, or turn out not to be true!
The other reason why computers seemed appealing, even from the beginning, was the thought that if proof was merely the application of logical steps, one after another, perhaps proof could be rendered mechanical, and people could focus on the "interesting" parts: developing new theories, adding new rules, and posing new conjectures (to be automatically proved or disproved).
It turned out that the initial optimism was somewhat misplaced: for limited theories, computers can come up with proofs – indeed, SMT is probably the most amazing result from that realm – but beyond that, things become intractable.
Indeed, for most interesting math, there are simply too many possible approaches, and there is no tractable procedure by which a proof will be discovered. This led to an alternative approach, and a new set of tools, proof assistants, of which Lean is one of the more modern incarnations.
In a proof assistant, a person is primarily responsible for constructing the proof, but each step that they carry out is checked to be valid. Thus, while proofs cannot be discovered automatically (at least, not in general), even quite rich proofs can be checked automatically, and that still turns out to be incredibly valuable, as it eliminates a serious potential risk. Assuming one trusts the proof checker (which we’ll talk about later), one need not trust the person who wrote the proof, only that the statement says what we want it to say. Given that proofs can often be orders of magnitude more complicated than theorem statements, this is a huge benefit.
5 What’s in Lean?
There are two key ideas in Lean. The first, which I mentioned earlier, is the idea that theorems are types and proofs are terms. While some people are enamored with this from a theoretical perspective, its appeal is quite practical: it means we can reuse key ideas across programming and proving. Indeed, while in some ways mechanized theorem proving will feel quite different, in other ways it should feel familiar: like learning a new programming language, just a very, very different one.
There’s another idea, which actually comes from an entirely different theorem prover, developed around the same time as de Bruijn’s AUTOMATH: LCF (or Edinburgh LCF), created by Robin Milner in 1972. Its main idea was to have a trusted core that protected logical truth, and then a flexible language with which to define conveniences for writing proofs. What Milner realized, even then, was that mechanized proofs could be very tedious things, and much easier if you could leverage programming to make them easier. These procedures, called tactics, are a key idea in Lean: indeed, we pretty much won’t write proofs without using some tactics!
We used some tactics in the above proof: rw is one of the main ones we’ll use, and it is the main way that we use equalities in our proofs. rfl is also a tactic, but a simpler one: it constructs a proof of equality between two terms that are equivalent. Equality is itself an inductive type, with a single constructor that allows reflexive proofs, and so rfl is really just a convenience over the constructor Eq.refl.
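As a small illustration of that last point, we can use the constructor Eq.refl directly instead of the rfl tactic (the theorem names here are made up for the example):

theorem two_plus_two : 2 + 2 = 4 := Eq.refl 4
theorem two_plus_two' : 2 + 2 = 4 := by rfl

Both work because 2 + 2 and 4 reduce to the same value, so the equality is trivial: the first supplies the constructor explicitly, while the second asks the tactic to do it for us.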
While we are motivated to move to Lean by proving properties involving infinite types, we will begin our journey at the beginning: with Lean as a programming language.
6 Installing Lean
1. Download & Install Visual Studio Code.
2. Extensions -> Lean4 (install it).
3. Open the Lean 4 setup guide by creating a new text file using "File > New Text File" (Ctrl+N), clicking on the ∀-symbol in the top right and selecting "Documentation… > Setup: Show Setup Guide".
4. Follow the Lean 4 setup guide. It will walk you through learning resources for Lean 4, teach you how to set up Lean's dependencies on your platform, install Lean 4 for you at the click of a button, and help you set up your first project.
7 Programming in Lean
Lean is a typed functional programming language. Its type system is more powerful than nearly all others, but it shares many features with other typed functional languages like OCaml, Haskell, Scala, etc.
For example: all values have types, which we can inspect with #check; this is part of the interactive nature of Lean:
#check true
#check "hello"
We can make definitions with def:
def m : Nat := 1 -- m is a natural number
def n : Nat := 0
def b1 : Bool := true -- b1 is a Boolean
def b2 : Bool := false
Here we’ve given explicit type annotations to each of the definitions. Often, Lean can infer these, but we will usually write them explicitly, both to avoid any ambiguity and because it makes the code easier to read.
We can, of course, ask for the type of definitions and expressions using them.
#check m -- output: Nat
#check n
#check n + 0 -- Nat
#check m * (n + 0) -- Nat
#check b1 -- Bool
#check b1 && b2 -- "&&" is the Boolean and
#check b1 || b2 -- Boolean or
And since it is a programming language, we can also evaluate expressions:
#eval 5 * 4 -- 20
#eval m + 2 -- 3
#eval (m + 7) % 2 -- 0
#eval b1 && b2 -- false
def can also be used to create functions, which are expressions that are abstracted over (typed) arguments:
def g (s : String) : Bool := s.length > 0
Here, we are using an operation String.length to calculate how long a string is: Lean allows, as a convenience, functions in a type's namespace to be applied using dot notation (as above). This works because s has type String, so Lean can figure out that s.length means String.length s. Without the dot notation, the same function could be written as:
def g (s : String) : Bool := String.length s > 0
This form of definition is really just a shorthand over a definition for a function value: as a functional language, functions are first class. The above could have been written as:
def g : String -> Bool := fun s => s.length > 0
Or, if we wanted to add more type annotations:
def g : String -> Bool := fun (s : String) => s.length > 0
Note that fun is how we create anonymous functions (or lambdas, as they are often called), and function types are written with ->.
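As a quick check that all of these definitions of g behave the same way, we can evaluate them, and we can even apply an anonymous function directly without naming it (the example strings are just made up):

#eval g "hello"  -- true
#eval g ""       -- false
#eval (fun (s : String) => s.length > 0) "hi"  -- true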
We can also use other typical data structures, like lists:
def my_length (l : List Nat) := match l with
| [] => 0
| _ :: xs => 1 + my_length xs
def my_length' : List Nat -> Nat
| List.nil => 0
| List.cons _ xs => 1 + my_length' xs
def my_length'' (l : List Nat) := List.length l
def my_length''' (l : List Nat) := l.length
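All four definitions compute the same thing; here is a quick sanity check with #eval (the particular lists are arbitrary examples):

#eval my_length [1, 2, 3]   -- 3
#eval my_length' []         -- 0
#eval my_length'' [4, 4]    -- 2
#eval [9, 8, 7].length      -- 3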