On this page:
8.1 Verifying Compilers with Simulations
8.2 Problem 1:   Helper theorems relating Rstar to TStep
8.3 Problem 2:   Substitution commutes with compile.
8.4 Problem 3:   Show that the simulation respects compiler.
8.5 Problem 4:   Simulation is preserved over reduction
8.6 Problem 5:   Simulation is preserved over multiple steps
8.7 Problem 6:   Prove the compiler is correct
8.7

8 Homework 10

Released: Wednesday, March 22, 2023 at 6:00pm

Due: Wednesday, March 29, 2023 at 6:00pm

What to Download:

https://github.com/logiccomp/s23-hw10/raw/main/hw10.lean

Please start with this file, filling in where appropriate, and submit your homework to the hw10 assignment on Gradescope. This page has additional information, and allows us to format things nicely.

To do this assignment in the browser: Create a Codespace.

8.1 Verifying Compilers with Simulations

Earlier, we’ve seen simpler compiler correctness proofs of languages that were just arithmetic, or perhaps, arithmetic with variables. In this assignment, we’ll introduce slightly more interesting languages, in order to demonstrate the main technique used in proving compilers correct: simulation relations.

Compiler correctness, at its essence, amounts to the following theorem:

Given a source program E_s, if it is compiled to a target program E_t, then runinng E_s \overset{*}{\rightarrow} v_s (to a source value v_s) and running the target program E_t \overset{*}{\rightarrow} v_t (to a target value v_t) should result in related (or equivalent) target values. In languages with side effects (printing, network, etc), they should also produce the same effects.

The most common way of proving this is with a so-called "simulation" relation, which describes a relationship between source a target programs that should hold as the programs evaluate. So rather than trying to prove, all at once, that v_s is equivalent to v_t (the final values after runinng the source and target program), we instead just show that after each step, the simulation relation still holds. Technically, what it will say is that if the source term takes a step, then after zero or more steps the target will reach a term that is in the simulation relation with the new source term. This allows the target to take more (or fewer) steps than the source. Provided this relation holds as we go, and the relation relates equal values to values, once the programs terminate, the relation will give us what we need, that v_s is equivalent to v_t.

In this assignment, you will prove a simulation proof for a compiler that compiles booleans to numbers, in a small functional language that has functions, variables, function application, and either booleans (and ‘and‘) or numbers (and ‘plus‘ and ‘minus‘). By including functions (which are untyped), we end up including a lot of the complexity of more interesting languages (e.g., recursion, etc), but without adding lots of cases to our syntax (which would translate to more cases of the proof).

We first give you a background definitions, which we talked about in class (in L27, 3/22).

inductive Rstar {T : Type} (P : T -> T -> Prop) : T -> T -> Prop :=
| refl : forall a, Rstar P a a
| step : forall a b c, P a b -> Rstar P b c -> Rstar P a c

theorem RTrans {T : Type} {P : T -> T -> Prop} {a b c : T} : 
        Rstar P a b -> Rstar P b c -> Rstar P a c :=
  by intros H1
     revert c
     induction H1 <;> intros c H2 <;> simp <;> try assumption
     constructor
     assumption
     rename_i H
     apply H
     assumption

And definitions for the source & target language, which includes substitution functions to handle function application:

inductive Source :=
| b : Bool -> Source
| and : Source -> Source -> Source
| var : String -> Source
| lam : String -> Source -> Source
| app : Source -> Source -> Source

def Source.sub (e : Source) (x : String) (body : Source) :=
  match body with
  | b _ => body
  | and t1 t2 => and (e.sub x t1) (e.sub x t2)
  | var x' => if x == x' then e else body
  | app t1 t2 => app (e.sub x t1) (e.sub x t2)
  | lam x' body' => if x == x' then body
                    else lam x' (e.sub x body')

inductive SStep : Source -> Source -> Prop :=
| app : forall e e' a, SStep e e' ->
                  SStep (Source.app e a) (Source.app e' a)
| beta : forall x1 body e,
    SStep (Source.app (Source.lam x1 body) e)
        (e.sub x1 body)
| and1 : forall e1 e1' e2, SStep e1 e1' ->
                        SStep (Source.and e1 e2) (Source.and e1' e2)
| and2 : forall b e2 e2', SStep e2 e2' ->
                      SStep (Source.and (Source.b b) e2) (Source.and (Source.b b) e2')
| and : forall b1 b2, SStep (Source.and (Source.b b1) (Source.b b2))
                        (Source.b (b1 && b2))

inductive Target :=
| n : Nat -> Target
| plus : Target -> Target -> Target
| minus : Target -> Target -> Target
| var : String -> Target
| lam : String -> Target -> Target
| app : Target -> Target -> Target

def Target.sub (e : Target) (x : String) (body : Target) :=
  match body with
  | n _ => body
  | plus t1 t2 => plus (e.sub x t1) (e.sub x t2)
  | minus t1 t2 => minus (e.sub x t1) (e.sub x t2)
  | var x' => if x == x' then e else body
  | app t1 t2 => app (e.sub x t1) (e.sub x t2)
  | lam x' body' => if x == x' then body
                      else lam x' (e.sub x body')

inductive TStep : Target -> Target -> Prop :=
| app : forall e e' a, TStep e e' ->
                  TStep (Target.app e a) (Target.app e' a)
| beta : forall x1 body e,
    TStep (Target.app (Target.lam x1 body) e)
          (e.sub x1 body)
| plus1 : forall e1 e1' e2, TStep e1 e1' ->
                        TStep (Target.plus e1 e2) (Target.plus e1' e2)
| plus2 : forall e1 e2 e2', TStep e2 e2' ->
                        TStep (Target.plus e1 e2) (Target.plus e1 e2')
| plus : forall n1 n2, TStep (Target.plus (Target.n n1) (Target.n n2))
                        (Target.n (n1 + n2))
| minus1 : forall e1 e1' e2, TStep e1 e1' ->
                      TStep (Target.minus e1 e2) (Target.minus e1' e2)
| minus2 : forall e1 e2 e2', TStep e2 e2' ->
                      TStep (Target.minus e1 e2) (Target.minus e1 e2')
| minus : forall n1 n2, TStep (Target.minus (Target.n n1) (Target.n n2))
                          (Target.n (n1 - n2))

We also give you the compiler:

def Source.compile : Source -> Target 
| b true    => Target.n 1
| b false   => Target.n 0
| and b1 b2 => Target.minus (Target.plus (compile b1) (compile b2)) (Target.n 1)
| var x     => Target.var x
| app t1 t2 => Target.app (compile t1) (compile t2)
| lam x bdy => Target.lam x (compile bdy)

And the simulation relation:

inductive sim : Source -> Target -> Prop :=
| comp : forall e, sim e e.compile
8.2 Problem 1: Helper theorems relating Rstar to TStep

In this first problem, you’ll prove various helper theorems relating to the evaluation of the target (number containing) language.

-- 7 lines 
theorem app_star : forall e1 e1' e2, Rstar TStep e1 e1' -> 
                  Rstar TStep (Target.app e1 e2) (Target.app e1' e2) :=
 by sorry

-- 7 lines
theorem plus_star1 : forall e1 e1' e2, Rstar TStep e1 e1' -> 
                  Rstar TStep (Target.plus e1 e2) (Target.plus e1' e2) :=
 by sorry

-- 7 lines
theorem plus_star2 : forall e1 e2 e2', Rstar TStep e2 e2' -> 
                  Rstar TStep (Target.plus e1 e2) (Target.plus e1 e2') :=
 by sorry

-- 7 lines
theorem minus_star1 : forall e1 e1' e2, Rstar TStep e1 e1' -> 
                  Rstar TStep (Target.minus e1 e2) (Target.minus e1' e2) :=
 by sorry

-- 7 lines
theorem minus_star2 : forall e1 e2 e2', Rstar TStep e2 e2' -> 
                  Rstar TStep (Target.minus e1 e2) (Target.minus e1 e2') :=
 by sorry
8.3 Problem 2: Substitution commutes with compile.

In this problem, you will prove that substituting and then compiling is the same as compiling and then substituting. This will be a necessary result in the ‘beta‘ case of ‘sim_step‘.

Think about whether the expression being substituted, or the expression substituted into (e or body) make more sense to do induction on. Also: in the cases where variable comparison is used (var & lam), remember L23 (3/13).

-- 15 lines
theorem compile_sub : forall (e : Source) x (body : Source),
  (e.sub x body).compile = Target.sub e.compile x body.compile :=
 by sorry
8.4 Problem 3: Show that the simulation respects compiler.

In some relations, this is involved: in ours, this should be trivial! But it is a necessary step, as logically, it is how we start our argument: first, we show that the source & target term are in the relation. Then we should that at each step, they remain; finally, we should that when they terminate, they terminate at related values.

-- 1 line
theorem compile_sim : forall t,
  sim t t.compile := 
 by sorry
8.5 Problem 4: Simulation is preserved over reduction

This is the heart of the proof. It says that for a single step of the source, after any number of steps of the target, there is some target term that is related again. i.e., we can always get back to a pair of related terms. We will use this iteratively in the next proof.

Be sure to use the theorems you did in Problem 1 & Problem 2, and as a hint: you will want to do induction on the SSTep relation.

-- 67 lines
theorem sim_step : forall t1 t1',
  SStep t1 t1' ->
  forall t2,
  sim t1 t2 ->
  exists t2', Rstar TStep t2 t2' /\ sim t1' t2' := 
 by intros t1 t1' stept1
    induction stept1 <;> intros t2 sim1 <;> sorry
8.6 Problem 5: Simulation is preserved over multiple steps

This problem shows that if we take many steps at the source, the result from the previous theorem still holds. It is much easier, since we can appeal to the single step result (where most of the work is done).

-- 15 lines
theorem step_sim_star : forall t1 t1',
  Rstar SStep t1 t1' ->
  forall t2,
  sim t1 t2 ->
  exists t2', Rstar TStep t2 t2' /\ sim t1' t2' := 
 by intros t1 t1' rst1
    induction rst1 <;> intros t2 sim1 <;> sorry
8.7 Problem 6: Prove the compiler is correct
-- 7 lines
theorem correct : forall t b, Rstar SStep t (Source.b b) ->
                              Rstar TStep t.compile ((Source.b b).compile) := 
 by sorry