Lecture 27: Compiler Correctness
1 Purpose
Learn how compilers are verified.
2 Existentials
First, we are going to talk about a new logical construct: the existential. In logic, \exists x. P(x) means that there is some x such that the logical statement P holds on x. For example, if P(x) were the statement "student x is named Alex", then \exists x. P(x) would be true if there were a student named Alex. This is in contrast to \forall x. P(x), which requires all students to be named Alex for the statement to be true.
We can encode existentials in Lean by using an inductive type:
inductive Exists {α : Sort u} (p : α → Prop) : Prop where
| intro (w : α) (h : p w) : Exists p
The type Exists P, thus, is a type that can be constructed using Exists.intro by passing it two things: a value w and a proof that P holds on w.
This is a slightly different version of the existential than the one typically used in math: like everything else in Lean, it is constructive. What that means is that if we have a proof of Exists P, not only do we know that there is some element for which P holds, we actually have such an element! In contrast, with the classical existential, just knowing \exists x. P(x) doesn't mean that you actually have such an x.
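To see the constructive reading in action, here is a small example (an illustration we include here, not one from the homework): case analysis on a proof of Exists p hands us the witness, which we can then reuse to build another existential:

example : forall (p q : Nat -> Prop),
    Exists p ->
    (forall x, p x -> q x) ->
    Exists q :=
  by intros p q h imp
     cases h
     case intro w hw =>
       -- we now actually have the witness w and a proof hw : p w
       apply Exists.intro w
       apply imp
       assumption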
There is syntactic shorthand for writing existentials, analogous to how you write forall, as we can see in this theorem. There is also an exists tactic, which applies the constructor for you with the particular value:
theorem T : exists x : Nat, x + x = 4 :=
by exists 2
The exists tactic will do some basic simplification and close obvious goals after plugging in the value.
| Tactic Use | Tactic Description |
| exists v | Given a goal of type exists x, P x, plugs in v as the witness, then simplifies and closes the remaining goal P v if it is obvious. |
3 Compiler Correctness
Earlier in the semester, we talked at a high level about what it means for a compiler, a program that translates code written in one programming language into another, to be correct.
In English, the statement, often referred to as semantics preservation, is that the meaning of the source program should be preserved by compilation to the target.
Formally, this requires giving meaning to both the source language and the target language. Usually, the target language already has a well-defined meaning: you can run the programs. Source languages may not already have very formal definitions, but they usually at least have informal ones, in the form of documentation, language specifications, etc. Often, one of the first tasks for a compiler verification effort is to give a formal definition of the source language.
With just definitions of the source and target languages (as, say, interpreters that allow us to run the programs), one could use property-based testing to try to check that the compiler works. However, there are a couple of reasons why compilers have been a main target of formal verification efforts with tools like Lean. First, running programs, especially programs that are not tiny, can take a while! If you are trying to test your compiler on thousands or millions of programs, and each takes even a couple of seconds to run, testing becomes a very slow process. More significantly, most languages are incredibly complex, with many different features that potentially interact in subtle ways. Indeed, the number of combinations of features means that covering a representative set of programs may be very difficult. This isn't to say that randomized testing isn't worthwhile: this kind of testing (often called "fuzz testing" when it is just looking for crashes) is used very effectively to find bugs. But in programs as complex as compilers (which may be hundreds of thousands of lines, if not millions), there is little hope of finding all the bugs by random testing alone.
While the overall theorem statement remains the same, when it comes time to actually prove that the compiler preserves semantics, the particular proof approach becomes quite important. In particular, naive induction on, e.g., the source program most likely won't work for any interesting programming language. Indeed, even in the small one we'll use for HW, that approach would fail. Instead, we need a more sophisticated proof strategy. There are various strategies that people have come up with: we'll use the one most common in practice, called a simulation.
At a high level, the idea is quite simple: we define a new relation, called the simulation relation, and we prove that as the source and target programs evaluate, they remain related by it. Provided that holds at every step, we can show that when the source program terminates (if it does), the target program must terminate at a related value, which is generally what we need to show correctness.
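Pictorially, the key obligation (the sim_step theorem below) completes a square: given the top edge (the simulation relation) and the left edge (one source step), we must find a target t2' that completes the right edge (zero or more target steps) and the bottom edge:

    t1 --- sim --- t2
     |              |
    step          step*
     v              v
    t1' -- sim --- t2'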
One thing this requires is that we have a way of talking about single steps of evaluation: we need them as data, not as code, so we can write proofs that inspect them.
4 Small-step Semantics
First, we define a relation that shows how a single step of evaluation happens. Then, we have another relation called Rstar. This is a relation that takes another relation P as an argument: in our case, it'll take our single-step relation. Rstar has two constructors: reflexivity (every element is related to itself) and the step rule (if P holds from a to b, and Rstar P holds from b to c, then Rstar P holds from a to c).
inductive ArithExpr :=
| num (n : Nat)
| add (e1 e2 : ArithExpr)
def ArithExpr.eval : ArithExpr → Nat
| num n => n
| add e1 e2 => e1.eval + e2.eval
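-- A quick check of eval (an added example, not in the original notes):
-- the expression 1 + 2 evaluates to 3.
example : (ArithExpr.add (ArithExpr.num 1) (ArithExpr.num 2)).eval = 3 :=
  rfl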
inductive ArithExpr.step : ArithExpr -> ArithExpr -> Prop where
| add1 : forall e1 e1' e2,
ArithExpr.step e1 e1' ->
ArithExpr.step (ArithExpr.add e1 e2) (ArithExpr.add e1' e2)
| add2 : forall e1 e2 e2',
ArithExpr.step e2 e2' ->
ArithExpr.step (ArithExpr.add e1 e2) (ArithExpr.add e1 e2')
| add : forall n1 n2,
ArithExpr.step (ArithExpr.add (ArithExpr.num n1)
(ArithExpr.num n2))
(ArithExpr.num (n1 + n2))
inductive Rstar {T : Type} (P : T -> T -> Prop) : T -> T -> Prop :=
| refl : forall a, Rstar P a a
| step : forall a b c, P a b -> Rstar P b c -> Rstar P a c
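To see how the pieces fit together, here is a small example we add for illustration: 1 + 2 takes a single step to the numeral for 1 + 2, and Rstar (using refl for the remaining zero steps) relates the two as well:

example : Rstar ArithExpr.step
    (ArithExpr.add (ArithExpr.num 1) (ArithExpr.num 2))
    (ArithExpr.num (1 + 2)) :=
  by apply Rstar.step
     apply ArithExpr.step.add
     apply Rstar.refl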
We can prove helper theorems that show how evaluation of subterms gets lifted into the evaluation of the whole term.
theorem rstar_add1 : forall e1 e1' e2,
Rstar ArithExpr.step e1 e1' ->
Rstar ArithExpr.step (ArithExpr.add e1 e2) (ArithExpr.add e1' e2) :=
by intros e1 e1' e2 rst
induction rst
simp [*, Rstar.refl]
apply Rstar.step
apply ArithExpr.step.add1
assumption
assumption
theorem rstar_add2 : forall e1 e2 e2',
Rstar ArithExpr.step e2 e2' ->
Rstar ArithExpr.step (ArithExpr.add e1 e2) (ArithExpr.add e1 e2') :=
by intros e1 e2 e2' rst
induction rst
simp [*, Rstar.refl]
apply Rstar.step
apply ArithExpr.step.add2
assumption
assumption
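As a quick sanity check (an added example, not part of the original notes), these lemmas let us lift the step from 1 + 2 to its sum underneath an enclosing addition:

example : Rstar ArithExpr.step
    (ArithExpr.add (ArithExpr.add (ArithExpr.num 1) (ArithExpr.num 2))
                   (ArithExpr.num 4))
    (ArithExpr.add (ArithExpr.num (1 + 2)) (ArithExpr.num 4)) :=
  by apply rstar_add1
     apply Rstar.step
     apply ArithExpr.step.add
     apply Rstar.refl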
Once we have a compiler, we can define a relation that shows how the source program and the target program are related. The simplest relation, which we'll be able to use for the compiler in homework, says that a term is related to its compiled version:
inductive sim : ArithExpr -> ArithExpr -> Prop where
| comp (e : ArithExpr) : sim e (compile e)
Then, we prove four theorems, which take us step-by-step to the end:
theorem compile_sim : forall e : ArithExpr,
sim e (compile e) := by sorry
theorem sim_step : forall t1 t1',
ArithExpr.step t1 t1' ->
forall t2,
sim t1 t2 ->
exists t2', Rstar ArithExpr.step t2 t2' /\ sim t1' t2' := by sorry
theorem step_sim_star : forall t1 t1',
Rstar ArithExpr.step t1 t1' ->
forall t2,
sim t1 t2 ->
exists t2', Rstar ArithExpr.step t2 t2' /\ sim t1' t2' := by sorry
theorem correct : forall e n, Rstar ArithExpr.step e (ArithExpr.num n) ->
Rstar ArithExpr.step (compile e) (ArithExpr.num n) := by sorry
In the case that the simulation relation is just the compiler, the first theorem is trivial. The second theorem, which shows that the relation is preserved over a single step, is generally where most of the work happens. You have to prove it by induction on the step relation, with a lot of case analysis on what the simulation relation and the induction hypotheses tell you. The third theorem is usually a little bit of work, but mostly follows from the second. And the last one is usually pretty trivial: just applying the third and doing case analysis.
5 A Complete Proof
Here is a complete simulation proof, in this case of constant folding for ArithExprs. While the technique is not required in that case, it is required for the compiler in homework, and the structure of the proof will be quite similar for homework, so it is a good model.
NOTE: It turns out that constant folding needs a more complicated relation than the one for HW, so this proof does not have quite the same structure as the one for HW, though the overall approach is similar (essentially, in this proof, when you do case analysis on the simulation relation, you might get _multiple_ cases, whereas in HW, doing case analysis on the simulation will always just tell you that the right side is the compiled version of the left).
inductive ArithExpr :=
| num (n : Nat)
| add (e1 e2 : ArithExpr)
inductive ArithExpr.step : ArithExpr -> ArithExpr -> Prop where
| add1 : forall e1 e1' e2,
ArithExpr.step e1 e1' ->
ArithExpr.step (ArithExpr.add e1 e2) (ArithExpr.add e1' e2)
| add2 : forall e1 e2 e2',
ArithExpr.step e2 e2' ->
ArithExpr.step (ArithExpr.add e1 e2) (ArithExpr.add e1 e2')
| add : forall n1 n2,
ArithExpr.step (ArithExpr.add (ArithExpr.num n1)
(ArithExpr.num n2))
(ArithExpr.num (n1 + n2))
theorem RTrans {T : Type} {P : T -> T -> Prop} {a b c : T} :
Rstar P a b -> Rstar P b c -> Rstar P a c :=
by intros H1
revert c
induction H1 <;> intros c H2 <;> try assumption
rename_i H
constructor
assumption
apply H
assumption
def ArithExpr.const_fold : ArithExpr -> ArithExpr
| num n => num n
| add (num n1) (num n2) => num (n1 + n2)
| add e1 e2 => add e1.const_fold e2.const_fold
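-- Two added sanity checks (not in the original notes): folding fires when
-- both children are already numerals, and otherwise const_fold just
-- recurses one level down:
example : (ArithExpr.add (ArithExpr.num 1) (ArithExpr.num 2)).const_fold
    = ArithExpr.num 3 := rfl
example : (ArithExpr.add (ArithExpr.add (ArithExpr.num 1) (ArithExpr.num 2))
                         (ArithExpr.num 4)).const_fold
    = ArithExpr.add (ArithExpr.num 3) (ArithExpr.num 4) := rfl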
theorem rstar_add1 : forall e1 e1' e2,
Rstar ArithExpr.step e1 e1' ->
Rstar ArithExpr.step (ArithExpr.add e1 e2) (ArithExpr.add e1' e2) :=
by intros e1 e1' e2 rst
induction rst
simp [*, Rstar.refl]
apply Rstar.step
apply ArithExpr.step.add1
assumption
assumption
theorem rstar_add2 : forall e1 e2 e2',
Rstar ArithExpr.step e2 e2' ->
Rstar ArithExpr.step (ArithExpr.add e1 e2) (ArithExpr.add e1 e2') :=
by intros e1 e2 e2' rst
induction rst
simp [*, Rstar.refl]
apply Rstar.step
apply ArithExpr.step.add2
assumption
assumption
/-NOTE: for constant folding, having the simulation relation
be just the compiler is not flexible enough. It turns
out we need a "maybe" compiler: i.e., it can sometimes
constant fold (plus_comp) and sometimes not (plus_ind).
-/
inductive sim : ArithExpr -> ArithExpr -> Prop where
| plus_comp (n1 n2 : Nat) : sim (ArithExpr.add (ArithExpr.num n1)
(ArithExpr.num n2))
(ArithExpr.num (n1 + n2))
| plus_ind (e1 e1' e2 e2' : ArithExpr) :
sim e1 e1' -> sim e2 e2' ->
sim (ArithExpr.add e1 e2) (ArithExpr.add e1' e2')
| n (n : Nat) : sim (ArithExpr.num n) (ArithExpr.num n)
--| comp (e : ArithExpr) : sim e (e.const_fold)
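-- An added illustration: 1 + 2 is related to its folded form via plus_comp.
example : sim (ArithExpr.add (ArithExpr.num 1) (ArithExpr.num 2))
              (ArithExpr.num (1 + 2)) :=
  sim.plus_comp 1 2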
theorem step_add1 : forall e1 e1' e2,
  ArithExpr.step e1 e1' ->
  ArithExpr.const_fold (ArithExpr.add e1 e2)
    = ArithExpr.add (ArithExpr.const_fold e1) (ArithExpr.const_fold e2) := by sorry
theorem sim_step : forall t1 t1',
ArithExpr.step t1 t1' ->
forall t2,
sim t1 t2 ->
exists t2', Rstar ArithExpr.step t2 t2' /\ sim t1' t2' :=
by intros t1 t1' step
induction step <;> intros t2 sim'
case add1 e1 e1' e2 step1 IH =>
cases sim'
case plus_comp n1 n2 =>
exists (ArithExpr.num (n1 + n2))
case plus_ind a1 a2 sim1 sim2 =>
cases (IH a1 sim1)
case intro t2' H =>
cases H
exists (ArithExpr.add t2' a2)
constructor
apply rstar_add1
assumption
constructor
assumption
assumption
case add2 e1 e2 e2' step2 IH =>
cases sim'
case plus_comp n1 n2 =>
exists (ArithExpr.num (n1 + n2))
case plus_ind a1 a2 sim1 sim2 =>
cases (IH a2 sim2)
case intro t2' H =>
cases H
exists (ArithExpr.add a1 t2')
constructor
apply rstar_add2
assumption
constructor
assumption
assumption
case add n1 n2 =>
cases sim'
case plus_comp =>
exists (ArithExpr.num (n1 + n2))
constructor
constructor
constructor
case plus_ind e1 e2 sim1 sim2 =>
cases sim1
cases sim2
exists (ArithExpr.num (n1 + n2))
constructor
apply Rstar.step
apply ArithExpr.step.add
constructor
constructor
/-NOTE: because the simulation relation, in this case, is not
just the compiler, this proof is _not_ trivial (though it
is not particularly difficult). In the compiler for HW,
however, this is a one or two line theorem. -/
theorem compile_sim : forall e : ArithExpr,
sim e e.const_fold :=
by intros e
induction e
case num n =>
simp [ArithExpr.const_fold]
constructor
case add e1 e2 sim1 sim2 =>
cases e1 <;> cases e2
case num.num =>
constructor
case num.add =>
constructor
constructor
assumption
case add.num =>
constructor
assumption
constructor
case add.add =>
constructor
assumption
assumption
theorem step_sim_star : forall t1 t1',
Rstar ArithExpr.step t1 t1' ->
forall t2,
sim t1 t2 ->
exists t2', Rstar ArithExpr.step t2 t2' /\ sim t1' t2' :=
by intros t1 t1' rst
induction rst <;> intros t2 simt
case refl =>
exists t2
constructor
constructor
assumption
case step a b c stepab rstbc IH =>
cases (sim_step _ _ stepab _ simt)
case intro w H =>
cases H
case intro rst2 simb =>
cases (IH _ simb)
case intro w' H =>
cases H
exists w'
constructor
apply RTrans
assumption
assumption
assumption
theorem correct : forall e n,
Rstar ArithExpr.step e (ArithExpr.num n) ->
Rstar ArithExpr.step e.const_fold (ArithExpr.num n) :=
by intros e n rst1
cases (step_sim_star _ _ rst1 _ (compile_sim e))
case intro w H =>
cases H
case intro H1 H2 =>
cases H2
assumption
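As a final sanity check (an added usage example, not from the original notes), correct specializes to concrete programs: the constant-folded version of 1 + 2 still evaluates to the same numeral:

example : Rstar ArithExpr.step
    (ArithExpr.add (ArithExpr.num 1) (ArithExpr.num 2)).const_fold
    (ArithExpr.num (1 + 2)) :=
  by apply correct
     apply Rstar.step
     apply ArithExpr.step.add
     apply Rstar.refl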