Lecture 11.3: Control Flow and Compiler Correctness

Previously, we defined a compiler that consists of two compiler passes:

let compile (e : CalcLang.expr) : StackLang.program =
  let e' = uniquify StrMap.empty e in
  let prg = lower e' in
  prg

Our compiler consists of two passes: a uniquify pass to ensure we don't need to worry about name shadowing, and a lower pass to turn our CalcLang programs into equivalent StackLang ones.

That said, we have been ignoring how to implement control flow and branching in StackLang, and our lower function currently throws a runtime error if we pass in a CalcLang expression containing ITE.

Control flow with jumps

We can describe an if-then-else expression in CalcLang as a "branching" statement: if the guard is true, we travel down the first branch; otherwise, we travel down the second. StackLang, however, is represented as a straight-line program: a list of instructions to execute. So we will need similar operations to work on these lists. Let's take a look at the following CalcLang program:

CalcLang: if true then 10 else 20

We can represent each component of this expression as a StackLang subprogram:

the guard true is defined by the StackLang program [Push 1],
the equivalent StackLang program for the true branch simply pushes 10 onto the stack: [Push 10],
likewise with the false branch we will push 20 onto the stack: [Push 20].

If we naively concatenate these programs, we will end up with a program of [Push 1; Push 10; Push 20], which will produce a stack of three values. Instead, what we would like to happen is the following:

execute the instructions in the guard program, producing a value of 0 or 1,
if the guard is 1, continue to evaluate until we finish all instructions in the first sub-program. Once this is done, jump to the end of subprogram 2.
if the guard is 0, jump to the second sub-program and continue to execute instructions.

To capture these two different jump instructions, we have been carrying around two branches in our AST.

JumpIfNotOne n will jump forward by n instructions if the top of the stack is not true (i.e., 1).
Jump n will perform an unconditional jump forward by n instructions.

Our simple ITE-expression will step in the following way:

CalcLang: if true then 10 else 20
StackLang: [Push 1; JumpIfNotOne 2; Push 10; Jump 1; Push 20]

If the guard returns false, or 0, JumpIfNotOne 2 will skip over two instructions: the truthy branch of Push 10 and the subsequent Jump instruction, landing us squarely in the else branch. From there, we can continue to execute as usual. If the guard returns true, we will ignore JumpIfNotOne 2, evaluate the truthy branch, and Jump over one instruction — the size of the else branch.

Looking at the configurations during execution, we would see the following:

- step 0: ⟨[Push 1; JumpIfNotOne 2; Push 10; Jump 1; Push 20],       [], {}⟩
- step 1: ⟨        [JumpIfNotOne 2; Push 10; Jump 1; Push 20],    1::[], {}⟩
- step 2: ⟨                        [Push 10; Jump 1; Push 20],    1::[], {}⟩
- step 3: ⟨                                 [Jump 1; Push 20],10::1::[], {}⟩
- step 4: ⟨                                                [],10::1::[], {}⟩

If we had a guard of false, this program would evaluate the other branch:

- step 0: ⟨[Push 0; JumpIfNotOne 2; Push 10; Jump 1; Push 20],       [], {}⟩
- step 1: ⟨        [JumpIfNotOne 2; Push 10; Jump 1; Push 20],    0::[], {}⟩
- step 2: ⟨                                         [Push 20],    0::[], {}⟩
- step 3: ⟨                                                [],20::1::[], {}⟩

Let's update our definition of step and fill in the cases for Jump and JumpIfNotOne:

let step (cfg : config) : config =
  match cfg with
  | ([],  vs, env) -> cfg
  | (hd::tl, vs, env) ->
    (match hd with
    ...
    | Jump n -> (List.drop n tl, vs, env)
    | JumpIfNotOne n ->
      (match vs with
       | [] -> failwith "Not enough on stack"
       | v :: vs' ->
         if v <> 1
         then (List.drop n tl, vs', env)
         else (tl, vs', env)))

A quick validation with utop corroborates the example we had before:

utop #  execute ([Push 0; JumpIfNotOne 2; Push 10; Jump 1; Push 20]);;
- : value = 20

utop #  execute ([Push 1; JumpIfNotOne 2; Push 10; Jump 1; Push 20]);;
- : value = 10

Next, we'll formalize this reasoning for our calc-to-stack lowering pass:

let rec lower (e : CalcLang.expr) : StackLang.program =
  match e with
  ...
  | ITE (g, t, f) ->
    let guard_code  = lower g in
    let then_code   = lower t in
    let else_code   = lower f in

    let then_length = List.length then_code in
    let else_length = List.length else_code in

    guard_code @ [StackLang.JumpIfNotOne (then_length + 1)] @
    then_code  @ [StackLang.Jump else_length] @
    else_code

With this, we have a code-complete compiler! Let's come up with a few more calc examples that introduce if-then-else expressions in interesting ways.

utop # open CalcLang;;
utop # open StackLang;;

utop # compile(Let ("x", ITE (Value (Bool false), Value (Nat 4), Value (Nat 10)),
                 App(Add, [Var "x"; Value (Nat 4)])));;
- : program = [Push 0; JumpIfNotOne 2; Push 4; Jump 1; Push 10; Set "x#0"; Get "x#0"; Push 4; AppInstr Add]

utop # compile(ITE (Value (Bool true),
            Let("x", Value (Nat 10), Var "x"),
            Let("x", Value (Nat 20), Var "x")));;
- : program = [Push 1; JumpIfNotOne 4; Push 10; Set "x#0"; Get "x#0"; Jump 3; Push 20; Set "x#0"; Get "x#0"]

Compiler Correctness

With our completed compiler, we can begin stating our correctness criteria in OCaml. At a high level, we want to say that any closed Calc expression that evaluates to a value can also be compiled to a StackLang program that produces a semantically equivalent value.

∀ (e : expr) (v : calc_value),
                 eval_opt StrMap.empty e = Some v ⇒ execute (compile e) = (conv v)

Here, eval_opt and execute_opt will convert runtime errors into option types, and conv will convert a calc_value v to its representative integer (our values in StackLang). Let's define our new helper functions.

let eval_opt env e =
  try Some (CalcLang.eval env e)
  with _ -> None

let conv v =
  match v with
  | Nat v -> v
  | Bool v -> if v then 1 else 0

Our proposition can now be stated in OCaml as follows:

let (==>) a b = if a then b else true

let prop_compiler_correct (e : CalcLang.expr) : bool =
  let vo = eval_opt StrMap.empty e in
  (vo <> None) ==>
    (match vo with
    | None -> false
    | Some v -> StackLang.execute (compile e) = conv v)

We have stated properties like this in prior lectures throughout our discussion of programming languages, both at a high-level with a written statement or a low-level in code. All of these prior statements, as well as this code above, is in our implementatino for the lecture and have been property checked. Let's walk through how this works, next.

Property checking compiler correctness.

We can start with the forall function we have seen in prior classes.

let rec forall gen prop num =
    if num <= 0 then None
    else let x = gen () in
         if prop x then forall gen prop (num - 1) else Some x

To actually generate expressions we can use Random.int to select which AST constructor to build.

let mkvar i : string = "x" ^ (string_of_int i)
let gen_expr () : CalcLang.expr =
  let open CalcLang in 
  let rec go depth i =
    let max_k = if depth <= 0 then 2 else 6 in
    match Random.int max_k with
    | 0 -> Value (Nat (Random.int 10))
    | 1 -> Value (Bool (Random.bool ()))
    | 2 -> Var (mkvar i)
    | 3 -> Let (mkvar (i+1), go (depth-1) i, go (depth-1) (i+1))
    | 4 ->
      let op = (match Random.int 6 with
               | 0 -> Add | 1 -> Mul | 2 -> And | 3 -> Or | 4 -> LT | 5 -> EQ
               | n -> failwith ("invalid random selector: " ^ string_of_int n)
               ) in
      App (op,  [go (depth-1) i; go (depth-1) i])
    | 5 -> ITE (go (depth-1) i, go (depth-1) i, go (depth-1) i)
    | n -> failwith ("invalid random selector: " ^ string_of_int n)
  in go 5 0

Finally, we can property check our statement in the usual way:

let _ =
  let samples = 1000000 in
  match forall gen_expr prop_compiler_correct samples with
  | Some ctr -> print_endline "calclang->stacklang: failed"
  | None -> print_endline ("✓ calclang->stacklang: " ^ string_of_int samples ^ " checks pass")